
CA2219314A1 - Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof - Google Patents

Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof

Info

Publication number
CA2219314A1
Authority
CA
Canada
Prior art keywords
views
trilinear
scene
generating
employing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA 2219314
Other languages
French (fr)
Inventor
Amnon Shashua
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hexagon Metrology Israel Ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from IL11349695A external-priority patent/IL113496A/en
Application filed by Individual filed Critical Individual
Publication of CA2219314A1 publication Critical patent/CA2219314A1/en
Abandoned legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T17/00 Three dimensional [3D] modelling, e.g. data description of 3D objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/00 3D [Three Dimensional] image rendering
    • G06T15/10 Geometric effects
    • G06T15/20 Perspective computation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/50 Depth or shape recovery
    • G06T7/55 Depth or shape recovery from multiple images
    • G06T7/593 Depth or shape recovery from multiple images from stereo images
    • G06T7/596 Depth or shape recovery from multiple images from stereo images from three or more stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G06T2207/10012 Stereo images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10032 Satellite or aerial image; Remote sensing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30181 Earth observation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Geometry (AREA)
  • Computer Graphics (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Image Processing (AREA)
  • Image Analysis (AREA)
  • Length Measuring Devices By Optical Means (AREA)
  • Testing, Inspecting, Measuring Of Stereoscopic Televisions And Televisions (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A method for generating information regarding a 3D object from at least one 2D projection thereof. The method includes providing at least one 2D projection (40) of a 3D object, generating an array of numbers (50, 60) described by:

α_ijk = v'_i b_jk - v''_j a_ik (i, j, k = 1, 2, 3),

where a_ik and b_jk are elements of matrices A and B respectively and v'_i and v''_j are elements of vectors v' and v'' respectively, wherein the matrices (50) and vectors (60) together describe camera parameters of three views (102) of the 3D object, and employing the array to generate information regarding the 3D object (70).

Description

APPARATUS AND METHOD FOR RECREATING AND MANIPULATING A 3D OBJECT BASED ON A 2D PROJECTION THEREOF

FIELD OF THE INVENTION

The present invention relates to apparatus and methods for processing 2D projections of 3D objects, and particularly to apparatus and methods for geometric analysis of a 2D projection image.

BACKGROUND OF THE INVENTION

Publications describing state of the art methods for processing 2D projections of 3D objects, and technologies relevant thereto, are described in the references cited hereinbelow.
The disclosures of all publications mentioned in the specification and of the publications cited therein are hereby incorporated by reference.

SUMMARY OF THE INVENTION

The present invention seeks to provide image transfer apparatus and methods which are useful for generating a novel view of a 3D scene from first and second reference views thereof.
The present invention also seeks to provide 3D scene reconstruction methods and apparatus for generating a 3D representation of a 3D scene from first, second and third views thereof.
The present invention also seeks to provide improved apparatus and methods for processing 2D projections of 3D objects.
The present invention also seeks to provide methods for reconstruction of a 3D object based on a trilinear tensor defined on three views of the 3D object.
The present invention additionally seeks to provide methods for image transfer for a 3D object based on a trilinear tensor defined on three views of the 3D object.
The present invention also seeks to provide an image transfer method for generating a novel view of a 3D scene from first and second reference views thereof, the method including providing first and second reference views of a 3D scene, employing geometric information regarding the first reference view, second reference view and novel view, respectively, to generate a trilinear tensor representing the geometric relationship between the first, second and novel views, and generating the novel view by computing a multiplicity of novel view locations each corresponding to different first and second corresponding locations in the first and second reference views respectively, based on the first and second corresponding locations and the trilinear tensor.
The step of providing may, for example, comprise scanning in the first and second reference images.

The step of employing may, for example, include the step of generating a set of first, second and third corresponding locations in the first reference view, second reference view and novel view, respectively.
There is thus provided, in accordance with a preferred embodiment of the present invention, an image transfer method for generating a novel view of a 3D scene from first and second reference views thereof, the method including providing first and second reference views of a 3D scene, employing geometric information regarding the first reference view, second reference view and novel view, respectively, to generate a trilinear tensor representing the geometric relationship between the first, second and novel views, and generating the novel view by computing a multiplicity of novel view locations each corresponding to different first and second corresponding locations in the first and second reference views respectively, based on the first and second corresponding locations and the trilinear tensor.
Further in accordance with a preferred embodiment of the present invention, the step of providing includes scanning in the first and second reference images.
Still further in accordance with a preferred embodiment of the present invention, the step of employing includes the step of generating a set of first, second and third corresponding locations in the first reference view, second reference view and novel view, respectively.
There is also provided, in accordance with a preferred embodiment of the present invention, a 3D scene reconstruction method for generating a 3D representation of a 3D scene from first, second and third views thereof, the method including providing first, second and third views of a 3D scene, employing geometric information regarding the first, second and third views to generate a trilinear tensor representing the geometric relationship between the first, second and third views, and generating a 3D representation of the 3D scene from the trilinear tensor.
Further in accordance with a preferred embodiment of the present invention, the step of generating a 3D representation includes computing an epipolar geometric representation of the first and second views from the trilinear tensor, and generating the 3D representation from the epipolar geometric representation.
Also provided, in accordance with another preferred embodiment of the present invention, is image transfer apparatus for generating a novel view of a 3D scene from first and second reference views thereof, the apparatus including apparatus for providing first and second reference views of a 3D scene, a trilinear tensor generator operative to employ geometric information regarding the first reference view, second reference view and novel view, respectively, to generate a trilinear tensor representing the geometric relationship between the first, second and novel views, and a novel view generator operative to generate the novel view by computing a multiplicity of novel view locations each corresponding to different first and second corresponding locations in the first and second reference views respectively, based on the first and second corresponding locations and the trilinear tensor.
Further provided, in accordance with another preferred embodiment of the present invention, is 3D scene reconstruction apparatus for generating a 3D representation of a 3D scene from first, second and third views thereof, the apparatus including apparatus for providing first, second and third views of a 3D scene, a trilinear tensor generator operative to employ geometric information regarding the first, second and third views to generate a trilinear tensor representing the geometric relationship between the first, second and third views, and a 3D scene representation generator operative to generate a 3D representation of the 3D scene from the trilinear tensor.
Also provided, in accordance with another preferred embodiment of the present invention, is a method for generating information regarding a 3D object from at least one and preferably three 2D projections thereof, the method including providing at least one and preferably three 2D projections of a 3D object, generating an array of numbers described by:

α_ijk = v'_i b_jk - v''_j a_ik (i, j, k = 1, 2, 3),

where a_ik and b_jk are elements of matrices A and B respectively and v'_i and v''_j are elements of vectors v' and v'' respectively, wherein the matrices and vectors together describe camera parameters of three views of the 3D object, and employing the array to generate information regarding the 3D object.
Also provided, in accordance with a preferred embodiment of the present invention, is a method for generating a new view of a 3D object from first and second existing views thereof having n corresponding points p_i and p'_i (i = 1, ..., n), the method including generating a tensor α_ijk, plugging the values of points p_i and p'_i (i = 1, ..., n) into trilinear forms, thereby to extract an x'', y'' value representing a location in the new view, and generating the new view on the basis of the result of the plugging in step.
Further provided, in accordance with a preferred embodiment of the present invention, is a method for reconstructing a 3D object from at least one and preferably three 2D projections thereof, the method including providing at least one and preferably three 2D projections of a 3D object, generating an array of numbers described by:

α_ijk = v'_i b_jk - v''_j a_ik (i, j, k = 1, 2, 3),

where a_ik and b_jk are elements of matrices A and B respectively and v'_i and v''_j are elements of vectors v' and v'' respectively, wherein the matrices and vectors together describe camera parameters of three views of the 3D object, permuting the array into three homography matrices associated with three corresponding planes in 3D space, and employing the three homography matrices to reconstruct the 3D object.
Also provided, in accordance with a preferred embodiment of the present invention, is a visual recognition method including providing three perspective views of a 3D object between which a trilinear relationship exists, and employing the trilinear relationship between the views in order to perform visual recognition by alignment.
Further in accordance with a preferred embodiment of the present invention, the method also includes reprojecting the 3D object.
Still further in accordance with a preferred embodiment of the present invention, the information regarding the 3D object includes a reconstruction of the 3D object.
Additionally in accordance with a preferred embodiment of the present invention, the information regarding the 3D object includes at least one new view of the 3D object generated without reconstructing the 3D object.
Further in accordance with a preferred embodiment of the present invention, the at least one and preferably three 2D projections include at least one aerial photograph.
Still further in accordance with a preferred embodiment of the present invention, the at least one and preferably three 2D projections include at least one satellite photograph.
Further in accordance with a preferred embodiment of the present invention, the information regarding the 3D object comprises at least one coordinate of the 3D object.
Further in accordance with a preferred embodiment of the present invention, the 3D object includes an aerospace object.
Still further in accordance with a preferred embodiment of the present invention, the 3D object includes a large object such as a ship.
Further in accordance with a preferred embodiment of the present invention, the 3D object includes a nonexistent object.
Also provided, in accordance with a preferred embodiment of the present invention, is apparatus for generating information regarding a 3D object from at least one and preferably three 2D projections thereof, the apparatus including apparatus for providing at least one and preferably three 2D projections of a 3D object, an array generator operative to generate an array of numbers described by:

α_ijk = v'_i b_jk - v''_j a_ik (i, j, k = 1, 2, 3),

where a_ik and b_jk are elements of matrices A and B respectively and v'_i and v''_j are elements of vectors v' and v'' respectively, wherein the matrices and vectors together describe camera parameters of three views of the 3D object, and an array analyzer employing the array to generate information regarding the 3D object.
Also provided, in accordance with a preferred embodiment of the present invention, is apparatus for generating a new view of a 3D object from first and second existing views thereof having n corresponding points p_i and p'_i (i = 1, ..., n), the apparatus including apparatus for generating a tensor α_ijk, apparatus for plugging the values of points p_i and p'_i (i = 1, ..., n) into trilinear forms, thereby to extract an x'', y'' value representing a location in the new view, and apparatus for generating the new view on the basis of the result of the plugging in step.
Also provided, in accordance with a preferred embodiment of the present invention, is apparatus for reconstructing a 3D object from at least one and preferably three 2D projections thereof, the apparatus including apparatus for providing at least one and preferably three 2D projections of a 3D object, an array generator operative to generate an array of numbers described by:

α_ijk = v'_i b_jk - v''_j a_ik (i, j, k = 1, 2, 3),

where a_ik and b_jk are elements of matrices A and B respectively and v'_i and v''_j are elements of vectors v' and v'' respectively, wherein the matrices and vectors together describe camera parameters of three views of the 3D object, an array permutator operative to permute the array into three homography matrices associated with three corresponding planes in 3D space, and a 3D object reconstructor operative to employ the three homography matrices to reconstruct the 3D object.
Further provided, in accordance with a preferred embodiment of the present invention, is visual recognition apparatus including apparatus for providing three perspective views of a 3D object between which a trilinear relationship exists, and apparatus for employing the trilinear relationship between the views in order to perform visual recognition by alignment.

Further in accordance with a preferred embodiment of the present invention, at least one result of performing the above method is employed in order to perform at least one of the following applications: map making from aerial and satellite photographs, coordinate measurements in aerospace and shipyard assembly plants, coordinate measurements of industrial parts (CMM), automated optical based inspection of industrial parts, robotic cell alignment, robotic trajectory identification, 3D robotic feedback, 3D modelling of scenes, 3D modelling of objects, reverse engineering, and 3D digitizing.
Algebraic computation processes useful for recognition are described below with reference to Figs. 1 - 5.

BRIEF DESCRIPTION OF THE DRAWINGS AND APPENDICES

The present invention will be understood and appreciated from the following detailed description, taken in conjunction with the drawings in which:
Fig. 1 is an illustration of two graphs comparing the performance of an epipolar intersection method, shown in dotted line, with the performance of a trilinear functions method, shown in dashed line, in the presence of image noise;
Fig. 2 is a pictorial illustration of two model views of a three-dimensional scene as well as a third reprojected view thereof;
Fig. 3 is a pictorial illustration of a reprojection using a trilinear result;
Fig. 4 is a pictorial illustration of a reprojection using intersection of epipolar lines;
Fig. 5 is a pictorial illustration of a reprojection using a linear combination of views method;
Fig. 6 is a simplified functional block diagram of 3D scene reconstruction apparatus, constructed and operative in accordance with a preferred embodiment of the present invention, which is operative to generate a 3D representation of a 3D scene from at least three views thereof;
Fig. 7 is a simplified flowchart illustration of a preferred 3D scene reconstruction method, operative in accordance with a preferred embodiment of the present invention, which is useful in conjunction with the apparatus of Fig. 6;
Fig. 8 is a simplified functional block diagram of image transfer apparatus, constructed and operative in accordance with a preferred embodiment of the present invention, which is operative to generate a novel view of a 3D scene from at least two reference views thereof; and
Fig. 9 is a simplified flowchart illustration of a preferred image transfer method, operative in accordance with a preferred embodiment of the present invention, which is useful in conjunction with the apparatus of Fig. 8;
Fig. 10 is an illustration of a 3D reconstruction of a shoe which is also illustrated in Fig. 2;
Fig. 11 is a simplified block diagram illustration of a preferred method and apparatus for quality assurance of workpieces; and Fig. 12 is a simplified block diagram illustration of a preferred method and apparatus for generating a digital terrain map.
Also attached herewith are appendices which aid in the understanding and appreciation of one preferred embodiment of the invention shown and described herein, namely:
Appendix A, which is a listing, in "C" language, of a preferred implementation of trilinear computation unit 50 and epipolar geometry generation unit 60 of Fig. 6; and
Appendix B, which is a listing of a preferred software implementation of 3D reconstruction apparatus constructed and operative in accordance with a preferred embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

A pin-hole camera, like a 35mm still camera or video recorder, produces a two-dimensional (2D) projection of the viewed three-dimensional (3D) world. The resulting image can be analyzed on a geometric and photometric level. The geometric level means the geometric relation between the locations of features (points, lines) in 3D and their respective locations in the 2D image. The photometric level means the radiometric (reflectance properties of the surface, the spectral properties of the light sources illuminating the scene, etc.) relation between the scene and the luminosity (pixel grey values) in the image.
The 3D world is modeled as a cloud of points and so is the 2D world. In other words, we identify point features in the image that correspond to point features in 3D. There is a geometric relation between the two sets of points. When the camera moves in 3D we get more images of the same 3D world (i.e., of the same cloud of 3D points), and the question of interest is the geometrical relation between the corresponding set of 2D image points and the set of 3D points. In other words, if P_i denotes a set of 3D points, and p_i, p'_i, p''_i are three sets of 2D points (across three images) arranged such that points with the same index i correspond to the same 3D point P_i, then the image sets alone reveal much information about the 3D set.
There are two questions of interest: (i) what is necessary to measure in the 2D images (how many points across how many images) in order to recover the location of the 3D points (reconstruction); (ii) given the image correspondences, can we synthesize new views (the problem of image transfer) without explicitly reconstructing the object. The patent deals with both questions.
The general material of 3D-from-2D is an active area in Computer Vision (structure-from-motion and stereopsis), and is the sole engine in the industry of photogrammetry (map making from aerial and satellite photographs, coordinate measurements in aerospace and shipyard assembly plants, etc.).
The relationship between 3D points and 2D points can be represented as a 3 x 4 transformation matrix:

p ≅ MP,

where p = (x, y, 1)^T represents the 2D point in homogeneous coordinates (i.e., the image plane is the 2D projective plane), where x, y are the horizontal and vertical components of the point's location in the image (with respect to some arbitrary origin, say the geometric center of the image plane); P is represented by a 4-vector, which means that we view the object space as embedded in the three-dimensional projective space. The matrix M is 3 x 4 and represents the camera parameters (for example, the vector Q defined by MQ = 0 is the camera's center of projection, the point where all the optical rays converge). The sign ≅ denotes equality up to scale.
Because we are embedding the image and the object into projective spaces (which means that the world can undergo rigid motion and be stretched and sheared) we can transform the object space such that the camera transformation associated with the first view is [I, 0], where I is the identity matrix. For three views we have:

p ≅ [I, 0] P
p' ≅ [A, v'] P
p'' ≅ [B, v''] P

where P = (x, y, 1, k)^T, and the image points are p = (x, y, 1)^T, p' = (x', y', 1)^T and p'' = (x'', y'', 1)^T. Note: what we measure in the images are the locations of p, p', p''; the location of P (i.e., the value of k) is not known. A and B are some 3 x 3 matrices (not independent of each other) and v', v'' are some 3-vectors, together describing the camera parameters of the three views. Note: the camera parameters A, B, v', v'' are unknown. In other words, given three views with corresponding points p_i, p'_i, p''_i, i = 1, ..., n, there exists a set of numbers k_i and parameters A, B, v', v'' such that the equations above hold for all i.
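By way of illustration, the following is a minimal numeric sketch of this three-view model in Python with numpy. The matrices A, B and vectors v', v'' are random stand-ins, not values from any real camera, and the names are ours:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical camera parameters for views 2 and 3.
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
v1, v2 = rng.standard_normal(3), rng.standard_normal(3)   # v' and v''

def project(M, t, P):
    """Project the homogeneous 3D point P = (x, y, 1, k) through camera [M, t]."""
    q = M @ P[:3] + P[3] * t
    return q / q[2]                      # rescale so the third coordinate is 1

# A 3D point in the representation P = (x, y, 1, k): view 1 sees (x, y) directly.
P = np.array([0.3, -0.5, 1.0, 0.7])
p  = P[:3]                               # view 1: camera [I, 0], already normalized
p1 = project(A, v1, P)                   # view 2: p'  ~ [A, v'] P
p2 = project(B, v2, P)                   # view 3: p'' ~ [B, v''] P
```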
I. TRILINEAR TENSOR
The array of 27 numbers described by:
α_ijk = v'_i b_jk - v''_j a_ik,   i, j, k = 1, 2, 3,

where a_rs, b_rs are the elements of A, B, and v'_r, v''_r are the elements of v', v'', is referred to as the "trilinear tensor". These numbers are invariant to the particular projective representation of the 3D and 2D worlds, i.e., they are intrinsic to the three views (this is one of the general properties of tensors: they do not depend on the choice of basis for representation).
NOTE: the array α_ijk provides the foundation for numerous applications. A preferred method for producing the array from image measurements p, p', p'' will be described next.
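As an illustrative sketch (Python with numpy; 0-based indices stand in for 1..3), the 27 numbers follow directly from the camera parameters when those happen to be known:

```python
import numpy as np

def trilinear_tensor(A, B, v1, v2):
    """alpha[i,j,k] = v'_i * b_jk - v''_j * a_ik, returned as a 3x3x3 array."""
    return np.einsum('i,jk->ijk', v1, B) - np.einsum('j,ik->ijk', v2, A)
```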


A. Recovering α_ijk from image measurements

A corresponding triplet p, p', p'' satisfies a number of trilinear relationships. First, notation: we identify vectors and matrices by fixing some of the indices while varying others. For example, α_ij1 is a set of scalars, α_ij. is a set of 9 vectors (k varies while i, j remain fixed); α_i.. is a set of 3 matrices (α_1.., α_2.. and α_3..), and so forth.
For a given order (i.e., we do not permute the views), there are 9 trilinear functions that come in sets of 4 linearly independent forms. The coefficients of these forms are elements of the array α_ijk. For example, the following four forms are linearly independent invariant functions of three views:

x''(α_13 . p) - x''x'(α_33 . p) + x'(α_31 . p) - α_11 . p = 0,
y''(α_13 . p) - y''x'(α_33 . p) + x'(α_32 . p) - α_12 . p = 0,
x''(α_23 . p) - x''y'(α_33 . p) + y'(α_31 . p) - α_21 . p = 0,
y''(α_23 . p) - y''y'(α_33 . p) + y'(α_32 . p) - α_22 . p = 0.

The remaining functions are described in equations 5-11 presented herein.
Since every corresponding triplet p, p', p'' contributes four linearly independent equations, seven corresponding points across the three views uniquely determine (up to scale) the tensor α_ijk.
NOTE 1: what this means is that if we identify 7 triplets of corresponding points across the three views -- say the object is a human face and we track the corner of the nose across three views: its location in the first view will be denoted by (x, y), the location of the nose in the second view will be denoted by (x', y') and the location of the nose in the third view is (x'', y'') -- then each such triplet provides 4 linearly independent equations for the 27 parameters we seek to recover. Thus, we obtain 28 equations, which is more than sufficient to solve for α_ijk.
NOTE 2: There are many ways to solve linear systems, especially when more equations than unknowns are provided (i.e., when more than 7 corresponding triplets are given), for example least-squares techniques, and robust methods for removing outliers within least-squares techniques.
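A minimal sketch of such a solver (Python with numpy; our own naming) stacks the four trilinear equations per triplet into a 4n x 27 system and takes its null vector via SVD. This is the plain least-squares choice; the robust outlier handling mentioned above is omitted:

```python
import numpy as np

def recover_tensor(p, p1, p2):
    """p, p1, p2: (n, 2) arrays of matched (x, y) in views 1, 2, 3, with n >= 7.
    Returns the trilinear tensor alpha as a 3x3x3 array, up to scale."""
    rows = []
    for (x, y), u1, u2 in zip(p, p1, p2):
        ph = np.array([x, y, 1.0])
        for a in range(2):               # selects the x'' or y'' equation
            for b in range(2):           # selects the x' or y' equation
                M = np.zeros((3, 3, 3))
                M[b, 2, :] += u2[a] * ph          # u''_a * (alpha_b3 . p)
                M[2, 2, :] -= u2[a] * u1[b] * ph  # -u''_a u'_b * (alpha_33 . p)
                M[2, a, :] += u1[b] * ph          # u'_b * (alpha_3a . p)
                M[b, a, :] -= ph                  # -(alpha_ba . p)
                rows.append(M.ravel())
    _, _, Vt = np.linalg.svd(np.asarray(rows))
    return Vt[-1].reshape(3, 3, 3)       # null vector of the stacked system
```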

II. RANK 4 CONSTRAINT: TENSORIAL CONNECTIONS ACROSS MORE THAN 3 VIEWS
Given a set of m > 3 vie~,vs, the various trilinear tensors of subsets of 3 views out of entire set of vie~vs are not independent, and in fact are linearly related to each other. This relation, described below, we call "the rank 4 constraint".
We arrange a tensor of three views a tensor in a 9 x 3 matrix denoted by G, where column j = 1,~, 3 contains the elements ~~ ,j,2,~ 3.j,3-Let the the arbitrary set of m views be arranged as ~,b1,~b2,~bj,j = 3,...,m. Consider them--'~ tensors of views < ~ b2,~,bj > and let the correspondence G matrices be denoted by Gj. Then the concatenation of the Gj matrices into a 9 x 3(m--2) matrix has rank 4 regardless of m (in general, if the Gj matrices were independent, the rank of the concatenation ~vould have been the smaller dimension, i.e., 9 when m > 5).
The ranL;-4-constraint implies that the four largest principle components of the concate-nation matrix represents the geometry behveen views ~ and ~2, in a way that is statically optimal.
A. Application 1: Image Transfer Say we have t~,vo views with corresponding points p',p,', i = 1,...,n. We wish to create a new view of the same object. Given the tensor ~ijk we plug into the trilinear forms the values of Pi, p,' and e~ctract the value of 2/t, y~ regardless of the manner in which C~ . is obtained.
The tensor C~ can be obtained, for example, if we know the locations of p',' for 7 points, e.g., i = 1, ..., 7. Then for every 8th point we can recover p" from the pair p, p' and the tensor B. Application ~: Reconstruction A tensor of three views, or the four largest principle components of the concatenation matrix of m > ~ views (see Section II) can be used to recover the epipolar geometry between two vie~Ys, and from there to be used for reconstructing a 3D model of the scene.
Let Ej = ~j. be three 3 x 3 matrices, i.e., Ej = Crl,j,l7a!l,j,27...~ C~3,j,3. Then this particular permutation of the array of ~27 numbers into three matrices yields the following result:

The three matrices El, E2, E3 are projective transformations (homography matrices) as-sociated with some three planes in 3D space SUBSTITUTE SHEET (RUL~ 25 W O9.31 C~ PCTrUS96/05697 ~ Vllat this means is that there exists three planes 7rj in space, such that if P is coplanar with 7rl (for example) and p, p' are the image locations in the first two views, then El p p'.
Similarly for j = 2, 3.
The "fundamental matrix" F (or "coplanarity constraint" as used in photogrammetrical literature) between the first two views satisfies:
ETF + FTEj = O j = l, 2, 3 a relation that provides a linear system of equations for solving for F.
Similarly, the four largest principle components of the concatenation matrix (Section II) for m > 3 views can be used to solve for F, as follows. Let Ak, k = 1, ..., 4 be the four largest principle components of the concatenation matrix, each arranged in a 3 x 3 matrix (ro-v-wise). Then, AkF+ F Ak = O ~ = 1,2,3,4 which provides a statically optimal way to linearly recover F from m > 3 views.
The "epipolar" point v' can be recovered, either by the relation ETv' = O, or directly from the tensor as follows:
The "epipolar" point v' can be recovered either by the relation F^T v' = 0, or directly from the tensor as follows.
The cross-product of corresponding columns of each two of the three matrices E_1, E_2, E_3 provides an epipolar line, i.e., a line that is coincident with v'. Similarly, if we denote by a_1, a_2, a_3 the columns of one E matrix, and by b_1, b_2, b_3 the columns of another E matrix, then a_i x b_j - b_i x a_j is an epipolar line for i ≠ j. Taken together, the tensor gives rise to 18 epipolar lines whose intersection defines the epipole v'. The point of intersection can be found by, for example, a least-squares method.
PCTrUS~056~7 is an epipo]~r line for i ~ j, and ai x 6i pro-ides another set of three epipolar lines. By selecting in this ~vay two out of the four matrices A~ ..., A4 we get 24 epipolar lines which provide a redundant set of equations for solving for v'.
The projective model of the 3D scene follows from F and v', for example: the 3 x 4 matrix [[v']F,v'] is a camera transformation from 3D coordinates to the image coordinates of the second view. In other words, for every matching pair p, p' of points in first and second view, respectively, we have:
P --[v ] Fp + kv, thus the coordinate vector ~ , l, k] provides a 3D (projective) model of scene.
NOTE l: the permutation '~i-. yields also three homography matrices Wl,~2, W3 of three other planes. The homography matrices are from first two third view, i.e., Wlp ~--p" for p, p"
coming from P coplanar with the plane associated with Wl. Thus, analogous to the use of the matrices El, E2, E3, we can recover the fundamental matrix F" and epipolar point v"
associated with the epipolar geometry between the first and third view.

SUBSmUr~ ~ llIE ~C) W 096/34365 PCTrUS9C~OJ697 ACHIEVING IMAGE CORRESPONDENCES ACROSS 3 VIE~VS

One of the important steps in manipulating 3D scenes from 2D imagery is the step of obtaining a reliable set (as dense as possible) of m;ltrhing points across the views at hand.
The current state-of-the-art use correlation techniques across the image intensities of t-vo views together with geometric information, such as the epipolar geometry.
With the tensor one can extend the correspondence method to take advantage of three views together and without assuming camera calibration. Given an initial set of corre-spondinO points across three views one recovers the trilinear tensor. Then, the trilinear tensor provides a constraint for using correlation methods to accurately locate the matching pOillts. This can be done as follows.
Let E" E;2, E3 denote the arrangement c~.;. of the tensor and W1, W2, W3 denote the ar-rangement c~i. of the tensor. We first align all three images such that the displacements between matching points become only along epipolar lines. This is done as follows.
Let E_1, E_2, E_3 denote the arrangement α_.j. of the tensor and W_1, W_2, W_3 denote the arrangement α_i.. of the tensor. We first align all three images such that the displacements between matching points become only along epipolar lines. This is done as follows.
Let E = Σ_j ρ_j E_j, where the coefficients ρ_j are the solution of the linear least-squares condition E p_k ≅ p'_k for all initial matching points p_k, p'_k in the first and second image, respectively. Similarly, let W = Σ_i λ_i W_i, where the coefficients λ_i are the solution of the linear least-squares condition W p_k ≅ p''_k for all initial matching points p_k, p''_k in the first and third image, respectively. The process of "warping" image two with the transformation E^-1 and image three with the transformation W^-1 produces two new images, denoted 2' and 3', such that the matching points across images 1, 2' are along epipolar lines and the matching points across images 1, 3' are along epipolar lines.
(2", y") are matching points in images 1, 2', 3' respectively, then p' = p+c~u7' and p" = p+,~w"
for some coefficients c~"B. Note that w', w", ~"B are quantities that change with the location (2~ y)-The trilinear equations, (3)-(6) presented herein, provide a geometric constraint on the coef~icients c~"B, i.e., for each arbitrary value of cr, the value of ,B is then uniquely determined, and vice versa. In order to recover both ~"B we use the image intensities of l. ~', 3' as follows.

SUBSTITUTE SHEET (RULE 26) W 096/34365 PCTrUS9''05~7 The coefficients c~,~ satisfy the "constant brightness equation":
a~ IY)TWr + It12 = O
,Iy)Tw"+ Il3 = 0 where (I~,I3,) is the gradient vector at location p = (:l:,y) in the first image, and Il2 is the temporal derivative between images 1,2', and Itl3 is the temporal derivative between images 1, 3' (all derivatives are discrete a~r-~x;...~tions). From the trilin~r equations, (3)-(6) presented herein, we can obtain f(cY"B) = O where the coefiicients of the function f() are elements of the tensor. Thus one obtains a least-squares solution for c~ and ~ that sati~fies both geometric an photometric constraints.
The entire procedure can be done hierarchically along a Laplacian Pyramid as described in "J.R. Bergen and R. Hingorani, Hierarchical motion-based frame rate conv~L~ion"; Technical report~ David Sarnoff Research Center, 1990.

- SUBStlTUTE SHEET (RULE 26) W 096/34365 PCTrU~3~'0J697 Reference is now made to Fig- 6 which is a simpli~ied functional block diagram of 3D scene reconstruction apparatuS, constructed and operative in accordance with a preferred embodi-ment of the present invention, which is operative to generate a 3D representation of a 3D scene from at least three views there-of.
The apparatus of Fig. 6 includes apparatus for provid-ing at least 3 digital images of a 3D scene or object from at least 3 respective viewpoints, such as a CCD camera 10 which operates from 3 different viewpoints, or such as a film camera 20, which may be airborne, associated with a scanner 30 (such as a Zeiss PS-l which is operative to digitize the images generated by film camera 20.
The at least 3 digital views are fed to a matching point finder 40 which is operative to identify at least 7 and preferably a multiplicity of triplets of matching points from the 3 digital views. "Matching points" from different views are points or locations which correspond to a single location in the real, 3D world. These points may, for example, be identified manually. For aerial photographs, commercially available matching point software may be employed such as the Match-T package mar-keted by Inpho in Stuttgart, which performs a correspondence function.
Conventional methods for finding matching points are described in Wang, H. and Brady, J. M, "Corner detection: some new results", IEEE colloquium Digest of Systems Aspects of Machine Perception and vision", Londone 1992, pp. 1.1 - 1.4.
A trilinear tensor computation unit 50 receives the matching point triplets and computes a trilinear tensor representing the geometric relationship between the three views.
The trilinear tensor is then employed to generate a 3D
representation of a 3D scene. According to one embodiment of the present invention, the trilinear tensor is employed by an epipo-lar geometry generation unit 60 to compute an epipolar geometric representation of two of the three views. Subsequently, 3D
representation generation unit 70 generates the 3D representation-SUBSTITUTE SHEET (RULE 26~

W 096/34365 PCTAUS9~3~97 -of the scene or object from the epipolar geometric representation output of unit 60, as described in more detail in:
a. Faugeras, 0. D. "What can be seen in 3 dimensions with an uncalibrated stereo rig?", Proceedings of the European Conference on Computer Vision, pages 563 - 578, Santa Margherita Ligure, Italy, June 1992.
b. Hartley, R. et al, "Stereo from uncalibrated cameras ~
Pro~;ngs IEEE Conf. on Computer Vision and Pattern Recognition, 761 - 764, Champaign, IL, June 1992.
c. Shashua, A. "Projective structure from uncalibrated images: structure from motion and recognition. IEEE Transactions on PAMI, 16(8): 778 - 790, 1994.
A preferred implementation of units 50 and 60 is de-scribed, in "C" computer language form, in Appendix A. A
preferred implementation of unit 50 is described from page 1 of Appendix A until toward the end of page 4 thereof. A preferred implementation of unit 60 is described from the end of page 4 of Appendix A until the middle of page 8. Subroutines and statisti-cal procedures which are useful in understanding the above mate-rial appear from page 8 of Appendix A onward.
The 3D representation, including 3D information representating at least a portion or an aspect of the 3D scene or object, may be employed by a laser computer printer to generate a new view of the object or scene. Alternatively, conventional CAD
(computer aided design~ software in conjunction with a conventional plotter may be employed to generate a new view of the object or scene. The CAD software may also be operative to compare 2 CAD files in quality assurance applications.
Reference is now made of Fig. 7 which is a simplified flowchart illustration of a preferred 3D scene reconstruction method, operative in accordance with a preferred embodiment of the present invention, which is useful in conjunction with the apparatus o~ Fig. 6. Fig. 7 is generally self-explanatory. The 3 views of the scene or object can be digital, if they are accessed from a digital archive or generated, e.g. by a digital camera. If they are not digital, they are scanned or otherwise digitized.

SU~STITUTE SHEET !RULE ?~' W 096/34365 PCTNS9GJ'~_C~7 Any suitable conventional or other formats may be employed. For example, the Silicon Graphics's Inventor format may initially be employed for the 3D representation- The Inventor format may be converted into Postscript in order to print a new view of the 3D representation.
As shown, the 3D representation of the scene is useful ~ in performing a wide range of activities such as 3D measurementS
of the scene or object, generation, e.g. as a printout, of a new view of the scene or object, and quality assurance comparisons in which the generated 3D representation of the object or scene is compared to a desired object or scene or a desired 3D
representatiOn thereof, using conventional methods.
Reference is now made to Fig. 8 which is a simplified functional block diagram of image transfer apparatus, constructed and operative in accordance with a preferred embodiment of the present invention, which is operative to generate a novel view of a 3D scene from at least two reference views thereof. The apparatus of Fig. 8 is similar to the apparatus of Fig. 6. Howev-er, in Fig. 8, a novel view of a 3D scene is directly generated from only at least two reference views thereof, preferably with-out generating a 3D representation intermediate.
In Fig. 8, geometric information regarding the two reference views and the desired novel view is employed to gener-ate a trilinear tensor representing the geometric relationship between the three views. Typically, at least 7 triplets of loca-tions in the novel view may be identified, e.g. manually, which correspond to 7 locations respectively in each of the at least two reference views. Typically, at least some infromation regard-ing the novel view is available. For example, if it is desired to update a GIS (geographic information system) year-old Yiew, based on at least two new reference views of the same area. It is typically possible to identify at least 7 locations in the year-old view which correspond to 7 locations in the two reference views and which can be assumed will still exist in the soon-to-be-generated current version of the year-old view.
The novel view is typically generated by computing a SUBSTITUTE SHEET (RULE 26) W 096/34365 PCTrUS9~'05697 multiplicity of novel view locations each corresponding to dif-ferent first and second corresponding locations in said first and second reference views respectively based on said first and second corresponding locations and said trilinear tensor. For example, the matching point finder 140 may generate a multiplicity of pairs of matching points from the two reference views, say 1000 such pairs. For the first 7 pairs, a user may manually indicate 7 matching points in the novel view.
Once the trilinear tensor is generated by trilinear tensor computation unit 150, the coordinates of Matching Point Pairs 8 - 1000 may be plugged into the trilinear tensor, as shown in Fig. 9, in order to generate coordinates of matching points 8 - 1000 in the novel view.
The novel view thus generated may, for example, be compared to the same view as seen a year before, in order to identify differences in the scene which took place in the course of the year.
Reference is now made to Fig. 9 which is a simplified flowchart illustration of a preferred image transfer method, operative in accordance with a preferred embodiment of the present invention, which is useful in conjunction with the appa-ratus of Fig. 8. Fig. 9 is generally self-explanatory.
In any of the embodiments described herein, if more than 3 views are provided, "intermediate" tensors may be computed for each 3 views. Then, a representative tensor may be computed based on the relationships between these "intermediate" tensors.
Fig. 10 is an illustration of a 3D reconstruction of a shoe which is also illustrated in Fig. 2. Fig. 10 was generated by finding matching points and reconstructing their 3D locations.
Next the coordinates are processed by CAD software to generate the surface shown in Fig. 10.
Fig. 11 is a simplified block diagram illustration of a preferred method and apparatus for ~uality assurance of workpieces. Fig. 11 includes an array 200 of 3 CCD cameras 210 which are aimed at a single location, so as to yield three perspectives of that location. The CCD cameras are attached to a SUBSTITUTE SHEET (RULE 26) - - -W 096/34365 PCTnUS~<~C97 robot arm 212 and therefore can move relative to a workpiece 214 arriving along a conveyor belt 220 in accordance with suitable instructions from a controller 224.- ~ -When a workpiece enters the field of view of the CCD
~ cameras 210, the conveyor belt 220 typically pauses to allow substantially the entirety of the surface area of the workpiece to be imaged by the cameras 210. The controller 224, via robot arm 212, moves the camera array 200 around the object such that, typically, almost its entire surface area is imaged. For example, the camera array 200 may be moved through 10 different positions around the object, and at each position, each of the 3 CCD
cameras images the workpiece. The number of positions employed depends on the complexity of the workpiece.
This process yields a plurality of image triplets, eachimage triplet including three digital images of the same portion of the workpiece, from 3 respective perspectives. For each image triplet, the corresponding position of the array 200 and of each of the cameras 210 may be computed, based on the robot arm's location, which is known, and using hand-eye calibration.
Each image triplet is processed by units 240, 250, 260, 270 and 290 which may be similar to units 40, 50, 60, 70 and 90, respectively, of Fig. 6.
The CAD model information generated by CAD S/W 290 from each image triplet (each probe location) is stored in a suitable memory 300. A computation unit 310 is operative to integrate the multiplicity of probe locations corresponding to the multiplicity of positions of CCD camera array 200, into a single coordinate system. The necessary coordinate transformations are computed by inverting the transformations which define the CCD camera array's motion.
A computational unit 320 compares the output of unit 310 to a reference CAD model and computes differences therebetween. These differences are compared, in a computational unit 330, to accepted tolerances.
The apparatus of Fig. 11 is also suitable for 3D digi-tization applications for reverse engineering or CAD (computer SUBSlTiUI~ ~E~ (RUL~ 26) -W 096/34365 PCT/u~ G~7 aided design purposes)~
Fig. 12 is a simplified block diagram illustration of a preferred method and apparatus for generating a digital terrain map, e.g. in order to update municipal maps, to detect illegal structures, to serve a car navigation system, or even to map a microscopic object such as an integrated circuit.
An airborne CCD camera 334 is flown over a scene for which it is desired to generate a digital terrain map. The camera 334 generates 3 digital images of the scene from 3 respective perspectives. The 3 images are processed by units 340, 350, 360 and 370 which may be similar to units 40, 50, 60 and 70 of Fig.
6.
A surface interpolation procedure is performed by a surface interpolation unit 380, on the output of 3D representa-tion generation unit 370. A suitable surface interpolation method is described in Grimson, W. E. L, "A computational theory of visual surface interpolation", Proc. of the Royal Soc. of London B, 298, pp. 395 - 427, 1982.
Finally, perspective distortions are removed from the output of surface interpolation unit 380 by an orthophoto genera-tion unit 390, resulting in an orthophoto (planimetric map). A
suitable method of operation for unit 390 is described in Slama, C. C., "Manual of Photogrammetry", American Society of Photogram-metry, 1980.
The present invention is also useful for visualization of a 3D object, particularly for entertainment, animation or educational applications. A camera array such as array 200 of Fig. 11 circles around an object to be visualized and images substantially the entire surface area which it is desired to display to a user. For example, the camera array may image the object ~rom each of 200 positions surrounding the object.
Synthetic images may then be generated for positions other than the 200 above-mentioned positions. A desired position may be indicated by a user, e.g. by means of a joystick. The apparatus of the present invention may be used to generate a synthetic image for that position.

SUBSTITIJTE SHEET (RULE 26~

W 096/34365 PCTAUS9G~'~r~7 conventional driving simulation games employ synthetic backgrounds, however, the present invention may be employed to provide a driving simulation game with a real background. To do this, an array of at least 3, and preferably 5 - 10 cameras is moved within a desired scene such that substantially the entirety of the scenes can be captured by the camera array from at least 3 and preferably more different perspectives. For example, the scene may be captured from each of approximately 1000 positions of the camera array. New views are then generated, in accordance with the present invention, in order to accomodate a user's need for new views as indicated, e.g. by a joystick.
Algebraic functions useful for recognition are now described, based on an article entitled "Algebraic functions for recognition", to be published in IEEE, Transactions on PAMI.

SuBsTlTuTE SHEET (RIJLE 26~

W 096/34365 P~1/U~ OSGg7 Algebraic E'uncti~ s For Recognition I. INTRODUCTIO~
~ e establish a general result about algebraic connections across three perspective vie~vs of a 3D scene and demonstrate its application to visual recognition via alignment. ~Ve sho~
that in general, any three perspective views of a scene satisfy a pair of trilinear functions of image coordinates. In the limiting case, when all three views are orthographic, these functions become linear and reduce to the form discovered by [3~]. Using the trilinear result one can manipuiate views of an object (such as generate novel views from two model vie~ s) without recovering scene structure (metric or non-metric), camera transformation, or even the epipolar geometry. Moreover, the trilinear functions can be recovered by linear methods witl1 a minim~l configuration of seven points. The latter is shown to be new lower bound on the minimAI configuration that is required for a general linear solution to the problem of re-projecting a 3D scene onto an arbitrary novel view given corresponding points across t~vo reference views. Previous solutions rely on recovering the epipolar geometrv which, in turn, req~lires a minirn~l configuration of eight points for a linear solution.
The central results are contained in Theorems 1, 2 and 3. The first theorem states that the varietv of views 7,~ of a fixed 3D object obtained by an uncalibrated pin-hole camera satisfy a relation of the sort F(~ 2) = O, where ~ b2 are two arbitrary views of the object.
and F has a special trilinear form. The coefflcients of F can be recovered linearly ~vithout establishing first the epipolar geometry, 3D structure of the object, or camera motion. The auxiliary Lemmas required for the proof of Theorem 1 may be of interest on their own as they Qstablish certain regularities across projective transformations of the plane and introduce ne~
vie~v invariants (Lemma 4) .
Theorem '~ addresses the problem of recovering the coefficients of the trilinear functions in the most economical wat-. It is shown that among all possible trilinear functions across three views, there exists at most four linearly independent such functions. As a consequence, the coefficients of these functions can be recovered linearly from seven corresponding points across three vie~vs.
Theorem 3 is an obvious corollary of Theorem 1 but contains a significant practical aspect.
It is s hown that if the vie~vs ~ !'2 are obtained by parallel projection, then F reduces to a .special bilinear form--or. equivalently, that any perspective view ~ can be obtained by a SUBSTITUTE SHEET (RULE 26) W 096134365 ~ PCTnUS96/05697 r~tional lillear functioll of t-vo orthographic views. The reduction to a bilinear form implies that simpler recognition schemes are possible if the two reference views (model views) stored in memory are orthographic.
These results may have several applications (discussed in SectioIl ~v I), but the one empha-sized throughout this manuscript is for the task of recognition of 3D objects via alignn1ent.
The alignment approach for recognition ([37, 16], and references therein) is based on the result that the equivalence class of views of an object (ignoring self occlusions) undergoing 3D rigid, af~ine or projective transformations can be captured by storing a 3D model of the object, or simply by storing at least two arbitrary "model" views of the object--assum-ing that the correspondence problem between the model views can someho-v be solved (cf.
[~l. a. .33]). During recognition a small number of corresponding points bet-veen the novel input view and the model vie-vs of a particular candidate object are sufficient to ' re-project"
the model onto the novel vie-ving position. Recognition is achieved if the re-projected image is successfully matched against the input image. We refer to the problem of predicting a novel view from a set of model views using a limited number of corresponding points, as the problem of re-projection.
The problem of re-projection can in principle be dealt with via 3D reconstruction of shape and camera motion. This includes classical structure from motion methods for recovering rigid camera motion parameters and metric shape [36, 18, 35, 14, 15], and more recent methods for recovering non-metric structure, i.e., assuming the objects undergo 3D affine or projective transformations, or equivalently, that the cameras are uncalibrated [17, 25, 39, 10, 13, 30]. The classic approaches for perspective views are known to be unstable under errors in image measurements, narrow field of view, and internal camera calibration [3, 9, 12], and therefore are unlikely to be of practical use for purposes of re-projection. The non-metric approaches, as a general concept, have not been fully tested on real images, but the methods proposed so far rely on recovering first the epipolar geometry--a process that is also known to be unstable in the presence of noise.
It is also known that the epipolar geometry alone is sufficient to achieve re-projection by means of intersecting epipolar lines [24, 6, 8, 26, 23, 11] using at least eight corresponding points across the three views. This, however, is possible only if the centers of the three cameras are non-collinear--which can lead to numerical instability unless the centers are far from collinear--and any object point on the tri-focal plane cannot be re-projected as well. Furthermore, as with the non-metric reconstruction methods, obtaining the epipolar geometry is at best a sensitive process, even when dozens of corresponding points are used and with the state of the art methods (see Section V for more details and comparative analysis with simulated and real images).
For purposes of stability, therefore, it is worthwhile exploring more direct tools for achieving re-projection. For instance, instead of reconstruction of shape and invariants we would like to establish a direct connection between views expressed as functions of image coordinates alone--which we call "algebraic functions of views". Such a result was established in the orthographic case by [38]. There it was shown that any three orthographic views of an object satisfy a linear function of the corresponding image coordinates--this we will show here is simply a limiting case of a larger set of algebraic functions, that in general have a trilinear form. With these functions one can manipulate views of an object, such as create new views, without the need to recover shape or camera geometry as an intermediate step--all that is needed is to appropriately combine the image coordinates of two reference views. Also with these functions, the epipolar geometries are intertwined, leading not only to absence of singularities and a lower bound on the minimal configuration of points, but, as we shall see in the experimental section, to more accurate performance in the presence of errors in image measurements. Part of this work (Theorem 1 only) was presented in concise form in [31].

II. NOTATIONS
We consider object space to be the three-dimensional projective space P3, and image space to be the two-dimensional projective space P2. Let φ ⊂ P3 be a set of points standing for a 3D object, and let ψi ⊂ P2 denote views (arbitrary), indexed by i, of φ. Given two cameras with centers located at O, O' ∈ P3, respectively, the epipoles are defined to be at the intersection of the line OO' with both image planes. Because the image plane is finite, we can assign, without loss of generality, the value 1 as the third homogeneous coordinate to every observed image point. That is, if (x, y) are the observed image coordinates of some point (with respect to some arbitrary origin, say the geometric center of the image), then p = (x, y, 1) denotes the homogeneous coordinates of the image plane. Note that this convention ignores special views in which a point of φ is at infinity in those views; these singular cases are not modeled here.

Since we will be working with at most three views at a time, we denote the relevant epipoles as follows: let v ∈ ψ1 and v' ∈ ψ2 be the corresponding epipoles between views ψ1, ψ2, and let v̄ ∈ ψ1 and v'' ∈ ψ3 be the corresponding epipoles between views ψ1, ψ3. Likewise, corresponding image points across three views will be denoted by p = (x, y, 1), p' = (x', y', 1) and p'' = (x'', y'', 1). The term "image coordinates" will denote the non-homogeneous coordinate representation of P2, e.g., (x, y), (x', y'), (x'', y'') for the three corresponding points.
Planes will be denoted by πi, indexed by i, and just π if only one plane is discussed. All planes are assumed to be arbitrary and distinct from one another. The symbol ≅ denotes equality up to a scale, GLn stands for the group of n x n matrices, and PGLn is the group defined up to a scale.
III. THE TRILINEAR FORM
The central results of this manuscript are presented in the following two theorems. The remainder of the section is devoted to the proof of these results and their implications.
Theorem 1 (Trilinearity) Let ψ1, ψ2, ψ3 be three arbitrary perspective views of some object, modeled by a set of points in 3D. The image coordinates (x, y) ∈ ψ1, (x', y') ∈ ψ2 and (x'', y'') ∈ ψ3 of three corresponding points across the three views satisfy a pair of trilinear equations of the following form:

x''(α1x + α2y + α3) + x''x'(α4x + α5y + α6) + x'(α7x + α8y + α9) + α10x + α11y + α12 = 0,

and

y''(β1x + β2y + β3) + y''x'(β4x + β5y + β6) + x'(β7x + β8y + β9) + β10x + β11y + β12 = 0,

where the coefficients αj, βj, j = 1, ..., 12, are fixed for all points, are uniquely defined up to an overall scale, and αj = βj, j = 1, ..., 6.
The following auxiliary propositions are used as part of the proof.
Lemma 1 (Auxiliary - Existence) Let A ∈ PGL3 be the projective mapping (homography) ψ1 → ψ2 due to some plane π. Let A be scaled to satisfy p0' ≅ Ap0 + v', where p0 ∈ ψ1 and p0' ∈ ψ2 are corresponding points coming from an arbitrary point P0 ∈ P3 not lying on π. Then, for any corresponding pair p ∈ ψ1 and p' ∈ ψ2 coming from an arbitrary point P ∈ P3, we have

p' ≅ Ap + kv'.

The coefficient k is independent of ψ2, i.e., is invariant to the choice of the second view.
The lemma, its proof and its theoretical and practical implications are discussed in detail in [28, 32]. Note that the particular case where the homography A is affine, and the epipole v' is on the line at infinity, corresponds to the construction of affine structure from two orthographic views [17]. In a nutshell, a representation R0 of P3 (tetrad of coordinates) can always be chosen such that an arbitrary plane π is the plane at infinity. Then, a general uncalibrated camera motion generates representations R which can be shown to be related to R0 by an element of the affine group. Thus, the scalar k is an affine invariant within a projective framework, and is called a relative affine invariant. A ratio of two such invariants, each corresponding to a different reference plane, is a projective invariant [32].
For our purposes, there is no need to discuss the methods for recovering k; all we need is the existence of a relative affine invariant k associated with some arbitrary reference plane π which, in turn, gives rise to a homography A.
Definition 1 Homographies Ai ∈ PGL3 from ψ1 → ψi due to the same plane π are said to be scale-compatible if they are scaled to satisfy Lemma 1, i.e., if for any point P ∈ φ projecting onto p ∈ ψ1 and pi ∈ ψi, there exists a scalar k that satisfies

pi ≅ Ai p + k vi,

for any view ψi, where vi ∈ ψi is the epipole with ψ1 (scaled arbitrarily).
Lemma 2 (Auxiliary - Uniqueness) Let A, A' ∈ PGL3 be two homographies of ψ1 → ψ2 due to planes π1, π2, respectively. Then, there exists a scalar s that satisfies the equation

A - sA' = [αv', βv', γv'],

for some coefficients α, β, γ.
Proof: Let q ∈ ψ1 be any point in the first view. There exists a scalar sq that satisfies v' ≅ Aq - sq A'q. Let H = A - sq A'; then Hq ≅ v'. But, as shown in [29], Av ≅ v' for any homography ψ1 → ψ2 due to any plane. Therefore, Hv ≅ v' as well. The mapping of two distinct points q, v onto the same point v' could happen only if H is the homography due to the meridian plane (coplanar with the projection center O); thus Hp ≅ v' for all p ∈ ψ1, and sq is a fixed scalar s. The latter, in turn, implies that H is a matrix whose columns are multiples of v'. □
Lemma 3 (Auxiliary for Lemma 4) Let A, A' ∈ PGL3 be homographies from ψ1 → ψ2 due to distinct planes π1, π2, respectively, and B, B' ∈ PGL3 be homographies from ψ1 → ψ3 due to π1, π2, respectively. Then, A' = AT for some T ∈ PGL3, and B' = BCTC⁻¹, where Cv ≅ v̄.
Proof: Let A = A2⁻¹A1, where A1, A2 are homographies from ψ1, ψ2 onto π1, respectively. Similarly, B = B2⁻¹B1, where B1, B2 are homographies from ψ1, ψ3 onto π1, respectively. Let A1 v̄ = (c1, c2, c3)ᵀ, and let C ≅ A1⁻¹ diag(c1, c2, c3) A1. Then B1 ≅ A1C⁻¹, and thus we have B ≅ B2⁻¹A1C⁻¹. Note that the only difference between A1 and B1 is due to the different location of the epipoles v, v̄, which is compensated for by C (Cv ≅ v̄). Let E1 ∈ PGL3 be the homography from ψ1 to π2, and E2 ∈ PGL3 the homography from π2 to π1. Then, with proper scaling of E1 and E2, we have A' = A2⁻¹E2E1 = AA1⁻¹E2E1 = AT, and with proper scaling of C we have B' = B2⁻¹E2E1C⁻¹ = BCA1⁻¹E2E1C⁻¹ = BCTC⁻¹. □
Lemma 4 (Auxiliary - Uniqueness) For scale-compatible homographies, the scalars s, α, β, γ of Lemma 2 are invariants indexed by ψ1, π1, π2. That is, given an arbitrary third view ψ3, let B, B' be the homographies from ψ1 → ψ3 due to π1, π2, respectively. Let B be scale-compatible with A, and B' be scale-compatible with A'. Then,

B - sB' = [αv'', βv'', γv''].
Proof: We show first that s is invariant, i.e., that B - sB' is a matrix whose columns are multiples of v''. From Lemma 2 and Lemma 3 there exists a matrix H whose columns are multiples of v', a matrix T that satisfies A' = AT, and a scalar s such that I - sT = A⁻¹H.
After multiplying both sides on the left by BC, and then post-multiplying by C⁻¹, we obtain

B - sBCTC⁻¹ = BCA⁻¹HC⁻¹.


From Lemma 3 we have B' = BCTC⁻¹. The matrix A⁻¹H has columns which are multiples of v (because A⁻¹v' ≅ v), CA⁻¹H is a matrix whose columns are multiples of v̄, and BCA⁻¹H is a matrix whose columns are multiples of v''. Post-multiplying BCA⁻¹H by C⁻¹ does not change its form, because every column of BCA⁻¹HC⁻¹ is simply a linear combination of the columns of BCA⁻¹H. As a result, B - sB' is a matrix whose columns are multiples of v''.
Let H = A - sA' and H̄ = B - sB'. Since the homographies are scale-compatible, we have from Lemma 1 the existence of invariants k, k' associated with an arbitrary p ∈ ψ1, where k is due to π1 and k' is due to π2: p' ≅ Ap + kv' ≅ A'p + k'v', and p'' ≅ Bp + kv'' ≅ B'p + k'v''. Then from Lemma 2 we have Hp = (sk' - k)v' and H̄p = (sk' - k)v''. Since p is arbitrary, this could happen only if the coefficients of the multiples of v' in H and the coefficients of the multiples of v'' in H̄ coincide. □

Proof of Theorem 1: Lemma 1 provides the existence part of the theorem, as follows. Since Lemma 1 holds for any plane, choose a plane π1 and let A, B be the scale-compatible homographies ψ1 → ψ2 and ψ1 → ψ3, respectively. Then, for every point p ∈ ψ1, with corresponding points p' ∈ ψ2, p'' ∈ ψ3, there exists a scalar k that satisfies p' ≅ Ap + kv' and p'' ≅ Bp + kv''. We can isolate k from both equations and obtain:

k = (x'a3 - a1)ᵀp / (v1' - x'v3') = (y'a3 - a2)ᵀp / (v2' - y'v3') = (x'a2 - y'a1)ᵀp / (y'v1' - x'v2'),   (1)

k = (x''b3 - b1)ᵀp / (v1'' - x''v3'') = (y''b3 - b2)ᵀp / (v2'' - y''v3'') = (x''b2 - y''b1)ᵀp / (y''v1'' - x''v2''),   (2)

where a1, a2, a3 and b1, b2, b3 are the row vectors of A and B, respectively, and v' = (v1', v2', v3'), v'' = (v1'', v2'', v3''). Because of the invariance of k we can equate terms of (1) with terms of (2) and obtain trilinear functions of image coordinates across three views. For example, by equating the first term of each of the two equations, we obtain:

x''(v1'b3 - v3''a1)ᵀp - x''x'(v3'b3 - v3''a3)ᵀp + x'(v3'b1 - v1''a3)ᵀp - (v1'b1 - v1''a1)ᵀp = 0.   (3)

In a similar fashion, after equating the first term of (1) with the second term of (2), we obtain:

y''(v1'b3 - v3''a1)ᵀp - y''x'(v3'b3 - v3''a3)ᵀp + x'(v3'b2 - v2''a3)ᵀp - (v1'b2 - v2''a1)ᵀp = 0.   (4)

Both equations are of the desired form, with the first six coefficients identical across both equations.


W 096/3436~ PCT/U~ 5G~7 The q-lestion of uniqueness arises because Lemma 1 holds for ~n~ plane If we choose a different plane, say ~r2, with homographies A',B', then we must show tllat the new homo-graphies give rise to the same coefficients (up to an overall scale). The parenthesized terms in (3) and (4) have the general form: vibj--v~!ai, for some i and j. Thus, we need to show that there exists a scalar s that satisfies v~!(ai--sa'i) = vi(bj--sbj).
Tllis, however, follows directly from Lemmas 2 and 4. 0 The direct implication of the theorem is that one can generate a novel ~,-iew (~3) by simplv combining two model views (~ '2)- The coefficients ~j and p; of the combination can be 1ecovered together as a solution of a linear system of 17 equations (24--6--1) given nine corresponding points across the three views (more than nine points can be used for a least-squares solution).
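To make the counting concrete, the following sketch, written in the style of the C listing of Appendix A but not taken from it (the function name and the row layout are illustrative assumptions), builds the two rows that each corresponding triplet contributes to the linear system in the 18 stacked coefficients, six of which are shared between the two equations. The null space of the stacked rows, recoverable for instance by SVD from nine or more triplets, gives the coefficients up to scale:

    /* Illustrative sketch (not from the Appendices): two rows per triplet
     * for the pair of trilinear equations of Theorem 1.  The unknown
     * vector is t = (c1..c6, a7..a12, b7..b12), where c1..c6 are the
     * shared coefficients (alpha_j = beta_j, j = 1..6). */
    void trilinear_rows(double x, double y, double xp,
                        double xpp, double ypp,
                        double rowx[18], double rowy[18])
    {
        double p[3] = { x, y, 1.0 };
        int c;
        for (c = 0; c < 18; c++) {
            rowx[c] = 0.0;
            rowy[c] = 0.0;
        }
        for (c = 0; c < 3; c++) {
            rowx[c]    = xpp * p[c];       /* x''(c1 x + c2 y + c3)   */
            rowx[3+c]  = xpp * xp * p[c];  /* x''x'(c4 x + c5 y + c6) */
            rowx[6+c]  = xp * p[c];        /* x'(a7 x + a8 y + a9)    */
            rowx[9+c]  = p[c];             /* a10 x + a11 y + a12     */
            rowy[c]    = ypp * p[c];       /* y''(c1 x + c2 y + c3)   */
            rowy[3+c]  = ypp * xp * p[c];  /* y''x'(c4 x + c5 y + c6) */
            rowy[12+c] = xp * p[c];        /* x'(b7 x + b8 y + b9)    */
            rowy[15+c] = p[c];             /* b10 x + b11 y + b12     */
        }
    }

Note that y' does not appear: the pair of equations of Theorem 1 involves only x' from the second view.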
In the next theorem we obtain the lower bound on the number of points required for solving for the coefficients of the trilinear functions. The existence part of the proof of Theorem 1 indicates that there exist nine trilinear functions of that type, with coefficients having the general form vi'bj - vj''ai. Thus, we have at most 27 distinct coefficients (up to a uniform scale), and therefore, if more than two of the nine trilinear functions are linearly independent, we may solve for the coefficients using fewer than nine points. The next theorem shows that at most four of the trilinear functions are linearly independent, and consequently seven points are sufficient to solve for the coefficients.
Theorem 2 There exist nine distinct trilinear forms of the type described in Theorem 1, of which at most four are linearly independent. The coefficients of the four trilinear forms can be recovered linearly with seven corresponding points across the three views.
Proof: The existence of nine trilinear forms follows directly from (1) and (2). Let αij = vi'bj - vj''ai. The nine forms are given below (the first two are (3) and (4), repeated for convenience):

x''α13ᵀp - x''x'α33ᵀp + x'α31ᵀp - α11ᵀp = 0,   (3)
y''α13ᵀp - y''x'α33ᵀp + x'α32ᵀp - α12ᵀp = 0,   (4)
x''α23ᵀp - x''y'α33ᵀp + y'α31ᵀp - α21ᵀp = 0,   (5)
y''α23ᵀp - y''y'α33ᵀp + y'α32ᵀp - α22ᵀp = 0,   (6)
y''x'α31ᵀp - x''x'α32ᵀp + x''α12ᵀp - y''α11ᵀp = 0,   (7)
y''y'α31ᵀp - x''y'α32ᵀp + x''α22ᵀp - y''α21ᵀp = 0,   (8)
x''y'α13ᵀp - x''x'α23ᵀp + x'α21ᵀp - y'α11ᵀp = 0,   (9)
y''y'α13ᵀp - y''x'α23ᵀp + x'α22ᵀp - y'α12ᵀp = 0,   (10)
x''y'α12ᵀp - x''x'α22ᵀp + y''x'α21ᵀp - y''y'α11ᵀp = 0.   (11)

For a given triplet p, p', p'' the first four functions on the list produce a 4 x 27 matrix. The rank of the matrix is four because it contains four orthogonal columns (the columns associated with α11, α12, α21 and α22); therefore these functions are linearly independent. Since we have 27 coefficients, and each triplet p, p', p'' contributes four linear equations, seven corresponding points across the three views provide a sufficient number of equations for a linear solution for the coefficients (given that the system is determined up to a common scale, seven points produce two extra equations which can be used for consistency checking or for obtaining a least squares solution).

The remaining trilinear forms are linearly spanned by the first four, as follows:

(7) = y''(3) - x''(4)
(8) = y''(5) - x''(6)
(9) = y'(3) - x'(5)
(10) = y'(4) - x'(6)
(11) = y''y'(3) - x''y'(4) + x''x'(6) - y''x'(5)

where the numbers in parentheses represent the equation numbers of the various trilinear functions. □
Taken together, both theorems provide a constructive means for solving for the positions x'', y'' in a novel view given the correspondences p, p' across two model views. This process of generating a novel view can be easily accomplished without the need to explicitly recover structure, camera transformation, or even just the epipolar geometry, and it requires fewer corresponding points than any other known alternative.
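Once the coefficients are known, re-projection itself reduces to two divisions. The following is a minimal sketch under the same assumed coefficient layout as above (a[0..5] and b[0..5] hold the shared coefficients; the function name is illustrative, not from the Appendices):

    /* Illustrative sketch: solve the two trilinear equations of Theorem 1
     * for x'' and y'' given (x, y) and x'.  a[] holds alpha_1..alpha_12
     * and b[] holds beta_1..beta_12, with a[0..5] == b[0..5]. */
    void reproject_trilinear(const double a[12], const double b[12],
                             double x, double y, double xp,
                             double *xpp, double *ypp)
    {
        double denx = (a[0]*x + a[1]*y + a[2]) + xp*(a[3]*x + a[4]*y + a[5]);
        double deny = (b[0]*x + b[1]*y + b[2]) + xp*(b[3]*x + b[4]*y + b[5]);
        *xpp = -(xp*(a[6]*x + a[7]*y + a[8]) + a[9]*x + a[10]*y + a[11]) / denx;
        *ypp = -(xp*(b[6]*x + b[7]*y + b[8]) + b[9]*x + b[10]*y + b[11]) / deny;
    }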
The solution for x'', y'' is unique without constraints on the allowed camera transformations. There are, however, certain camera configurations that require a different set of four trilinear functions from the one suggested in the proof of Theorem 2. For example, the set of equations (5), (6), (9) and (10) is also linearly independent. Thus, in case v1' and v3' vanish simultaneously, i.e., v' ≅ (0, 1, 0), that set should be used instead. Similarly, equations (3), (4), (9) and (10) are linearly independent, and should be used in case v' ≅ (1, 0, 0). Similar situations arise with v'' ≅ (1, 0, 0) and v'' ≅ (0, 1, 0), which can be dealt with by choosing the appropriate basis of four functions from the six discussed above.
Note that we have not addressed the problem of singular configurations of seven points. For example, it is clear that if the seven points are coplanar, then their correspondences across the three views could not possibly yield a unique solution to the problem of recovering the coefficients. The matter of singular surfaces has been studied for the eight-point case necessary for recovering the epipolar geometry [19, 14, 22]. The same matter concerning the results presented in this manuscript is an open problem.
Moving away from the need to recover the epipolar geometry carries distinct and significant advantages. To get a better idea of these advantages, we consider briefly the process of re-projection using epipolar geometry. The epipolar intersection method can be described succinctly (see [11]) as follows. Let F13 and F23 be the matrices ("fundamental" matrices in recent terminology [10]) that satisfy p''ᵀF13 p = 0 and p''ᵀF23 p' = 0. Then, by incidence of p'' with its epipolar line, we have:

p'' ≅ F13 p x F23 p'.   (12)

Therefore, eight corresponding points across the three views are sufficient for a linear solution of the two fundamental matrices, and then all other object points can be re-projected onto the third view. Equation (12) is also a trilinear form, but not of the type introduced in Theorem 1. The differences include: (i) epipolar intersection requires the correspondences coming from eight points, rather than seven; (ii) the position of p'' is solved by a line intersection process which is singular in case the three camera centers are collinear, whereas in the trilinearity result the components of p'' are solved separately and the situation of three collinear cameras is admissible; (iii) the epipolar intersection process is decomposable, i.e., only two views are used at a time, whereas the epipolar geometries in the trilinearity result are intertwined and are not recoverable separately. The latter implies a better numerically behaved method in the presence of noise as well, and as will be shown later, the performance, even using the minimal number of required points, far exceeds the performance of epipolar intersection using many more points. In other words, by avoiding the need to recover the epipolar geometry we obtain a significant practical advantage as well, since the epipolar geometry is the most error-sensitive component when working with perspective views.


The connection between the general result of trilinear functions of views and the "linear combination of views" result [38] for orthographic views can easily be seen by setting A and B to be affine in P2 and v3' = v3'' = 0. For example, (3) reduces to

v1'x'' - v1''x' + (v1''a1 - v1'b1)ᵀp = 0,

which is of the form

α1x'' + α2x' + α3x + α4y + α5 = 0.

As in the perspective case, each point contributes four equations, but here there is no advantage in using all four of them to recover the coefficients; therefore we may use only two out of the four equations, and require four corresponding points to recover the coefficients. Thus, in the case where all three views are orthographic, x'' (respectively y'') is expressed as a linear combination of image coordinates of the two other views, as discovered by [38].
IV. THE BILINEAR FORM
Consider the case in which the two reference (model) views of an object are taken orthographically (using a tele lens would provide a reasonable approximation), but during recognition any perspective view of the object is allowed. It can easily be shown that the three views are then connected via bilinear functions (instead of trilinear):
Theorem 3 (Bilinearity) Within the conditions of Theorem 1, in case the views ψ1 and ψ2 are obtained by parallel projection, the pair of trilinear forms of Theorem 1 reduces to the following pair of bilinear equations:

x''(α1x + α2y + α3) + α4x''x' + α5x' + α6x + α7y + α8 = 0,

and

y''(β1x + β2y + β3) + β4y''x' + β5x' + β6x + β7y + β8 = 0,

where αj = βj, j = 1, ..., 4.
Proof: Under these conditions we have from Lemma 1 that A is affine in P2 and v3' = 0; therefore (3) reduces to:

x''(v1'b3 - v3''a1)ᵀp + v3''x''x' - v1''x' + (v1''a1 - v1'b1)ᵀp = 0.

Similarly, (4) reduces to:

y''(v1'b3 - v3''a1)ᵀp + v3''y''x' - v2''x' + (v2''a1 - v1'b2)ᵀp = 0.

Both equations are of the desired form, with the first four coefficients identical across both equations. □
The remaining trilinear forms undergo a similar reduction, and Theorem 2 still holds, i.e., we still have four linearly independent bilinear forms. Consequently, we have 21 coefficients up to a common scale (instead of 27) and four equations per point; thus five corresponding points (instead of seven) are sufficient for a linear solution.
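The bilinear analogue of the trilinear re-projection sketch given earlier is even simpler; again the coefficient layout and names are illustrative assumptions rather than part of the listings (a[0..3] == b[0..3] are the shared coefficients of Theorem 3):

    /* Illustrative sketch: re-projection with the bilinear form of
     * Theorem 3 (orthographic model views, perspective novel view). */
    void reproject_bilinear(const double a[8], const double b[8],
                            double x, double y, double xp,
                            double *xpp, double *ypp)
    {
        /* x''(a1 x + a2 y + a3) + a4 x''x' + a5 x' + a6 x + a7 y + a8 = 0 */
        double denx = a[0]*x + a[1]*y + a[2] + a[3]*xp;
        double deny = b[0]*x + b[1]*y + b[2] + b[3]*xp;
        *xpp = -(a[4]*xp + a[5]*x + a[6]*y + a[7]) / denx;
        *ypp = -(b[4]*xp + b[5]*x + b[6]*y + b[7]) / deny;
    }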
A bilinear function of three views has two advantages over the general trilinear function. First, as mentioned above, only five corresponding points (instead of seven) across three views are required for solving for the coefficients. Second, the lower the degree of the algebraic function, the less sensitive the solution may be in the presence of errors in measuring correspondences. In other words, it is likely (though not necessary) that the higher order terms, such as the trilinear term x''x'x in Equation 3, will have a higher contribution to the overall error sensitivity of the system.
Compared to the case when all views are assumed orthographic, this case is much less of an approximation. Since the model views are taken only once, it is not unreasonable to require that they be taken in a special way, namely, with a tele lens (assuming we are dealing with object recognition, rather than scene recognition). If this requirement is satisfied, then the recognition task is general, since we allow any perspective view to be taken during the recognition process.

V. EXPERIMENTAL DATA
The experiments described in this section were done in order to evaluate the practical aspects of using the trilinear result for re-projection, compared to using epipolar intersection and the linear combination result of [38] (the latter, as we have shown, is simply a limiting case of the trilinear result).
The epipolar intersection method was implemented as described in Section III by first recovering the fundamental matrices. Although eight corresponding points are sufficient for a linear solution, in practice one would use more than eight points for recovering the fundamental matrices by a linear or non-linear least squares method. Since linear least squares methods are still sensitive to image noise, we used the implementation of a non-linear method described in [20], which was kindly provided by T. Luong and L. Quan (these were two implementations of the method proposed in [20]; in each case, the implementation that provided the better results was adopted). The first experiment is with simulation data, showing that even when the epipolar geometry is recovered accurately, it is still significantly better to use the trilinear result, which avoids the process of line intersection. The second experiment is done on a real set of images, comparing the performance of the various methods and the number of corresponding points that are needed in practice to achieve reasonable re-projection results.

A. Computer Simulations

We used an object of 46 points placed randomly with z coordinates between 100 and 120 units, and x, y coordinates ranging randomly between -125 and +125. The focal length was 50 units, and the first view was obtained by the projection x = fX/Z, y = fY/Z. The second view (ψ2) was generated by a rotation around the point (0, 0, 100) with axis (0.14, 0.7, 0.7) and by an angle of 0.3 radians. The third view (ψ3) was generated by a rotation around the axis (0, 1, 0) with the same translation and angle. Various amounts of random noise were applied to all points that were to be re-projected onto a third view, but not to the eight or seven points that were used for recovering the parameters (fundamental matrices, or trilinear coefficients). The noise was random, added separately to each coordinate, with levels varying from 0.5 to 7.5 pixels of error. We performed 1000 trials as follows: 20 random objects were created, and for each degree of error the simulation was run 10 times per object. We collected the maximal re-projection error (in pixels) and the average re-projection error (averaged over all the points that were re-projected). These numbers were collected separately for each degree of error by averaging over all trials (200 of them) and recording the standard deviation as well. Since no errors were added to the eight or seven points that were used to determine the epipolar geometry and the trilinear coefficients, we simply solved the associated linear systems of equations required to obtain the fundamental matrices or the trilinear coefficients.
The results are shown in Figures 1A-1B. The graph on the left shows the performance of both algorithms for each level of image noise, measured by the maximal re-projection error. We see that under all noise levels the trilinear method is significantly better and also has a smaller standard deviation. The same holds for the average re-projection error, shown in the graph on the right. This difference in performance is expected, as the trilinear method takes all three views together, rather than every pair separately, and thus avoids line intersections.

B. Experiments on Real Images
Figures 2A-2C show three views of the object we selected for the experiment. The object is a sports shoe with added texture to facilitate the correspondence process. This object was chosen because of its complexity, i.e., it has the shape of a natural object and cannot easily be described parametrically (as a collection of planes or algebraic surfaces). Note that the situation depicted here is challenging, because the re-projected view is not in between the two model views, i.e., one should expect a larger sensitivity to image noise than in in-between situations. A set of 34 points was manually selected on one of the frames, ψ1, and their correspondences were automatically obtained along all other frames used in this experiment.
The correspondence process is based on an implementation of a coarse-to-fine optical-flow algorithm described in [7]. To achieve accurate correspondences across distant views, intermediate in-between frames were taken and the displacements across consecutive frames were added. The overall displacement field was then used to push ("warp") the first frame towards the target frame and thus create a synthetic image. Optical-flow was applied again between the synthetic frame and the target frame, and the resulting displacement was added to the overall displacement obtained earlier. This process provides a dense displacement field, which is then sampled to obtain the correspondences of the 34 points initially chosen in the first frame. The results of this process are shown in Figures 2A-2C by displaying squares centered around the computed locations of the corresponding points. One can see that the correspondences obtained in this manner are reasonable, and in most cases of sub-pixel accuracy. One could readily automate this process further by selecting points in the first frame for which the Hessian matrix of spatial derivatives is well conditioned, similar to the confidence values suggested in the implementations of [4, 7, 34]; however, the intention here was not so much to build a complete system as to test the performance of the trilinear re-projection method and compare it to the performance of the epipolar intersection and linear combination methods.
The trilinear method requires at least seven corresponding points across the three views (we need 26 equations, and seven points provide 28 equations), whereas epipolar intersection can be done (in principle) with eight points. The question we are about to address is: what number of points is required in practice (due to errors in correspondence, lens distortions and other effects that are not adequately modeled by the pin-hole camera model) to achieve reasonable performance?


The trilinear result was first applied with the minimal number of points (seven) for solving for the coefficients, and then applied with 8, 9, and 10 points using a linear least-squares solution (note that, in general, better solutions may be obtained by using SVD or Jacobi methods instead of linear least-squares, but that was not attempted here). The results are shown in Figures 3A-3B. Seven points provide a re-projection with maximal error of 3.3 pixels and average error of 0.98 pixels. The solution using 10 points provided an improvement, with maximal error of 1.44 and average error of 0.44 pixels. The performance using eight and nine points was reasonably in between the performances above. Using more points did not improve the results significantly; for example, when all 34 points were used, the maximal error went down to 1.14 pixels and the average error stayed at 0.42 pixels.
Next, the epipolar intersection method was applied. We used two methods for recovering the fundamental matrices. One method is by using the implementation of [20], and the other is by taking advantage of the fact that four of the corresponding points come from a plane (the ground plane). In the former case, many more than eight points were required in order to achieve reasonable results. For example, when using all 34 points, the maximal error was 43.4 pixels and the average error was 9.58 pixels. In the latter case, we recovered first the homography B due to the ground plane and then the epipole v'' using two additional points (those on the film cartridges). It is then known (see [28, 21, 32]) that F13 = [v'']B, where [v''] is the anti-symmetric matrix of v''. A similar procedure was used to recover F23. Therefore, only six points were used for re-projection, but nevertheless the results were slightly better: maximal error of 25.7 pixels and average error of 7.7 pixels. Figures 4A-4B show these results.
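A sketch of this second recovery method, F13 = [v'']B, is given below, assuming the plane homography B and the epipole v'' are already available; the helper names are illustrative, not taken from the Appendix A listing:

    /* Illustrative sketch: build the anti-symmetric matrix [v] of a
     * vector v, and form a fundamental matrix as F = [v] * B, where B
     * is a homography due to a plane (see [28, 21, 32]). */
    void skew3(const double v[3], double S[3][3])
    {
        S[0][0] = 0.0;   S[0][1] = -v[2]; S[0][2] = v[1];
        S[1][0] = v[2];  S[1][1] = 0.0;   S[1][2] = -v[0];
        S[2][0] = -v[1]; S[2][1] = v[0];  S[2][2] = 0.0;
    }

    void fundamental_from_plane(const double B[3][3], const double v[3],
                                double F[3][3])
    {
        double S[3][3];
        int i, j, k;
        skew3(v, S);
        for (i = 0; i < 3; i++) {
            for (j = 0; j < 3; j++) {
                F[i][j] = 0.0;
                for (k = 0; k < 3; k++)
                    F[i][j] += S[i][k] * B[k][j];
            }
        }
    }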
Finally, we tested the performance of re-projection using the linear combination method. Since the linear combination method holds only for orthographic views, we are actually testing the orthographic assumption under a perspective situation, or in other words, whether the higher (bilinear and trilinear) order terms of the trilinear equations are significant or not. The linear combination method requires at least four corresponding points across the three views. We applied the method with four, 10 (for comparison with the trilinear case shown in Figures 3A-3B), and all 34 points (the latter two using linear least squares). The results are displayed in Figures 5A-5C. The performance in all cases is significantly poorer than when using the trilinear functions, but better than the epipolar intersection method.

VI. DISCUSSION
We have seen that any view of a fixed 3D object can be expressed as a trilinear function of two reference views in the general case, or as a bilinear function when the reference views are created by means of parallel projection. These functions provide alternative, much simpler means for manipulating views of a scene than other methods. Moreover, they require fewer corresponding points in theory, and many fewer in practice. Experimental results show that the trilinear functions are also useful in practice, yielding performance that is significantly better than epipolar intersection or the linear combination method.
In general, two views admit a "fundamental" matrix (cf. [10]) representing the epipolar geometry between the two views, whose elements are subject to a cubic constraint (the rank of the matrix is 2). The trilinearity results (Theorems 1, 2) imply that three views admit a tensor with 27 distinct elements. We have seen that the tensor does not fail in cases where the epipolar constraint fails, such as when the three cameras are along a straight line (not an uncommon situation). The issue of singular configurations of seven points (besides the obvious singular configuration of seven coplanar points) was not addressed in this manuscript. However, the robustness of the re-projection results may indicate that such configurations are either very rare or do not exist. It would thus be important to investigate this issue, as it is widely believed that the numerical instability of the epipolar constraint lies in the existence of such critical surfaces. The notion of the "trilinear" tensor, its properties, its relation to the geometry of three views, and its applications to 3D reconstruction from multiple views constitute an important future direction.
The application that was emphasized throughout the manuscript is visual recognition via alignment. Reasonable performance was obtained with the minimal number of required points (seven) with the novel view (ψ3), which may be too many if the image-to-model matching is done by trying all possible combinations of point matches. The existence of bilinear functions in the special case where the model is orthographic, but the novel view is perspective, is more encouraging from the standpoint of counting points. Here we have the result that only five corresponding points are required to obtain recognition of perspective views (provided we can satisfy the requirement that the model is orthographic). We have not experimented with bilinear functions to see how many points would be needed in practice, but plan to do that in the future. Because of their simplicity, one may speculate that these algebraic functions will find uses in tasks other than visual recognition; some of those are discussed below.
There may exist other applications where simplicity is of major importance, whereas the number of points is less of a concern. Consider, for example, the application of model-based compression. With the trilinear functions we need 17 parameters to represent a view as a function of two reference views in full correspondence (recall, 27 coefficients were used in order to reduce the number of corresponding points from nine to seven). Assume both the sender and the receiver have the two reference views and apply the same algorithm for obtaining correspondences between the two views. To send a third view (ignoring problems of self-occlusion, which may be dealt with separately), the sender can solve for the 17 parameters using many points, but eventually send only the 17 parameters. The receiver then simply combines the two reference views in a "trilinear way" given the received parameters. This is clearly a domain where the number of points is not a major concern, whereas simplicity and robustness (as shown above), due to the short-cut in the computations, are of great importance.
Related to image coding, an approach of image decomposition into "layers" was recently proposed by [1, 2]. In this approach, a sequence of views is divided up into regions, the motion of each of which is described approximately by a 2D affine transformation. The sender sends the first image, followed only by the six affine parameters for each region for each subsequent frame. The use of algebraic functions of views can potentially make this approach more powerful, because instead of dividing up the scene into planes one can attempt to divide the scene into objects, each of which carries the 17 parameters describing its displacement onto the subsequent frame.

Another area of application may be in computer graphics. Re-projection techniques provide a short-cut for image rendering. Given two fully rendered views of some 3D object, other views (again ignoring self-occlusions) can be rendered by simply "combining" the reference views. Again, the number of corresponding points is less of a concern here.

REFERENCES
[1] E.H. Adelson. Layered representations for image coding. Technical Report 181, Media Laboratory, Massachusetts Institute of Technology, 1991.
[2] E.H. Adelson and J.Y.A. Wang. Layered representation for motion analysis. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition, pages 361-366, New York, NY, June 1993.
[3] G. Adiv. Inherent ambiguities in recovering 3D motion and structure from a noisy flow field. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-11(5):477-489, 1989.
[4] P. Anandan. A unified perspective on computational techniques for the measurement of visual motion. In Proceedings Image Understanding Workshop, pages 219-230, Los Angeles, CA, February 1987. Morgan Kaufmann, San Mateo, CA.
[5] I.A. Bachelder and S. Ullman. Contour matching using local affine transformations. In Proceedings Image Understanding Workshop. Morgan Kaufmann, San Mateo, CA, 1992.
[6] E.B. Barrett, M.H. Brill, N.N. Haag, and P.M. Payton. Invariant linear methods in photogrammetry and model-matching. In J.L. Mundy and A. Zisserman, editors, Applications of invariances in computer vision. MIT Press, 1992.
[7] J.R. Bergen and R. Hingorani. Hierarchical motion-based frame rate conversion. Technical report, David Sarnoff Research Center, 1990.
[8] S. Demey, A. Zisserman, and P. Beardsley. Affine and projective structure from motion. In Proceedings of the British Machine Vision Conference, October 1992.
[9] R. Dutta and M.A. Snyder. Robustness of correspondence based structure from motion. In Proceedings of the International Conference on Computer Vision, pages 106-110, Osaka, Japan, December 1990.
[10] O.D. Faugeras. What can be seen in three dimensions with an uncalibrated stereo rig? In Proceedings of the European Conference on Computer Vision, pages 563-578, Santa Margherita Ligure, Italy, June 1992.
[11] O.D. Faugeras and L. Robert. What can two images tell us about a third one? Technical report, INRIA, France, 1993.
[12] W.E.L. Grimson. Why stereo vision is not always about 3D reconstruction. A.I. Memo No. 1435, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, July 1993.
[13] R. Hartley, R. Gupta, and T. Chang. Stereo from uncalibrated cameras. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition, pages 761-764, Champaign, IL, June 1992.
[14] B.K.P. Horn. Relative orientation. International Journal of Computer Vision, 4:59-78, 1990.
[15] B.K.P. Horn. Relative orientation revisited. Journal of the Optical Society of America, 8:1630-1638, 1991.
[16] D.P. Huttenlocher and S. Ullman. Recognizing solid objects by alignment with an image. International Journal of Computer Vision, 5(2):195-212, 1990.
[17] J.J. Koenderink and A.J. Van Doorn. Affine structure from motion. Journal of the Optical Society of America, 8:377-385, 1991.
[18] H.C. Longuet-Higgins. A computer algorithm for reconstructing a scene from two projections. Nature, 293:133-135, 1981.
[19] H.C. Longuet-Higgins. The reconstruction of a scene from two projections: configurations that defeat the 8-point algorithm. In Proc. of the 1st Conf. on AI Applications, pages 395-397, Denver, December 1984.
[20] Q.T. Luong, R. Deriche, O.D. Faugeras, and T. Papadopoulo. On determining the fundamental matrix: Analysis of different methods and experimental results. Technical report, INRIA, France, 1993.
[21] Q.T. Luong and T. Vieville. Canonical representations for the geometries of multiple projective views. Technical report, INRIA, France, fall 1993.
[22] S.J. Maybank. The projective geometry of ambiguous surfaces. Proceedings of the Royal Society of London, 332:1-47, 1990.
[23] J. Mundy and A. Zisserman. Appendix: projective geometry for machine vision. In J. Mundy and A. Zisserman, editors, Geometric invariances in computer vision. MIT Press, Cambridge, 1992.
[24] J.L. Mundy, R.P. Welty, M.H. Brill, P.M. Payton, and E.B. Barrett. 3-D model alignment without computing pose. In Proceedings Image Understanding Workshop. Morgan Kaufmann, San Mateo, CA, January 1992.
[25] A. Shashua. Correspondence and affine shape from two orthographic views: Motion and recognition. A.I. Memo No. 1327, Artificial Intelligence Laboratory, Massachusetts Institute of Technology, December 1991.
[26] A. Shashua. Geometry and photometry in 3D visual recognition. PhD thesis, M.I.T. Artificial Intelligence Laboratory, AI-TR-1401, November 1992.
[27] A. Shashua. Illumination and view position in 3D visual recognition. In S.J. Hanson, J.E. Moody and R.P. Lippmann, editors, Advances in Neural Information Processing Systems 4, pages 404-411. San Mateo, CA: Morgan Kaufmann Publishers, 1992. Proceedings of the fourth annual conference NIPS, Dec. 1991, Denver, CO.
[28] A. Shashua. On geometric and algebraic aspects of 3D affine and projective structures from perspective 2D views. In The 2nd European Workshop on Invariants, Ponta Delgada, Azores, October 1993. Also in MIT AI Memo No. 1405, July 1993.
[29] A. Shashua. Projective depth: A geometric invariant for 3D reconstruction from two perspective/orthographic views and for visual recognition. In Proceedings of the International Conference on Computer Vision, pages 583-590, Berlin, Germany, May 1993.
[30] A. Shashua. Projective structure from uncalibrated images: structure from motion and recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1994, in press.
[31] A. Shashua. Trilinearity in visual recognition by alignment. In Proceedings of the European Conference on Computer Vision, Stockholm, Sweden, May 1994.
[32] A. Shashua and N. Navab. Relative affine structure: Theory and application to 3D reconstruction from perspective views. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition, Seattle, Washington, 1994.
[33] A. Shashua and S. Toelg. The quadric reference surface: Applications in registering views of complex 3D objects. In Proceedings of the European Conference on Computer Vision, Stockholm, Sweden, May 1994.
[34] C. Tomasi and T. Kanade. Factoring image sequences into shape and motion. In IEEE Workshop on Visual Motion, pages 21-29, Princeton, NJ, September 1991.
[35] R.Y. Tsai and T.S. Huang. Uniqueness and estimation of three-dimensional motion parameters of rigid objects with curved surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-6:13-26, 1984.
[36] S. Ullman. The Interpretation of Visual Motion. MIT Press, Cambridge and London, 1979.
[37] S. Ullman. Aligning pictorial descriptions: an approach to object recognition. Cognition, 32:193-254, 1989. Also in MIT AI Memo 931, Dec. 1986.
[38] S. Ullman and R. Basri. Recognition by linear combination of models. IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-13:992-1006, 1991. Also in M.I.T. AI Memo 1052, 1989.
[39] D. Weinshall. Model based invariants for 3-D vision. International Journal of Computer Vision, 10(1):27-42, 1993.

It is appreciated that some or all of the image processing operations shown and described herein may be implemented in software. Alternatively, some or all of the image processing operations shown and described herein may be implemented in ROM (read-only memory) form. The software components may, alternatively, be implemented in hardware, if desired, using conventional techniques.
The present invention has very broad applications and specifically is applicable in all fields in which 3D-from-2D techniques are known to be useful. Applications of the present invention include at least the following: photogrammetry applications comprising map making from aerial and satellite photographs and coordinate measurements in aerospace and shipyard assembly plants, coordinate measurements of industrial parts (CMM), automated optical based inspection of industrial parts, robotic cell alignment, robotic trajectory identification, 3D robotic feedback, 3D modelling of scenes, 3D modelling of objects, reverse engineering, and 3D digitizing.
Appendix B is a listing of a preferred software implementation of 3D reconstruction apparatus constructed and operative in accordance with a preferred embodiment of the present invention. To implement the 3D reconstruction apparatus based on Appendix B, Maple2 software, commercially available from MathSoft, may be employed in conjunction with an IBM PC or SUN workstation. The following procedure may be employed:
a. Programs generated from the listing of Appendix B may be loaded into Maple2 using the OPEN FILE command.
b. The subroutines may be run by bringing the cursor to the row on which the WITH (LINALG) command appears.
c. The RETURN key is then pressed until the cursor reaches the end of the file.
d. The cursor is then returned to the beginning of the file.
e. The simulation is run by pressing RETURN until the following row is reached, inclusive:
EVALN (PCOMP)
The system then prints on the screen a comparison between the given 3D world and the reconstructed 3D world.
It is appreciated that the listing of Appendix B is also useful in implementing the image transfer apparatus and method shown and described herein. The image transfer embodiment may, for example, be based on the listing of Appendix B used in conjunction with Equations 3 and 4 of the above section entitled Algebraic Functions for Recognition.
It is appreciated that the particular embodiments described in the Appendices are intended only to provide an extremely detailed disclosure of the present invention and are not intended to be limiting.
It is appreciated that various features of the invention which are, for clarity, described in the contexts of separate embodiments may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for brevity, described in the context of a single embodiment may also be provided separately or in any suitable subcombination.
It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described hereinabove. Rather, the scope of the present invention is defined only by the claims that follow:

APPENDIX A
#include <math.h>   /* for sqrt() */

BOOL ComputeAllTensors(IN int nTrials,
                       TENS Tensors[MAX_FRAMES][MAX_FRAMES][MAX_FRAMES],
                       OUT DMAT3 E_basis[MAX_FRAMES][MAX_FRAMES][4],
                       OUT int *size_E_basis)
{
    int rf, bf, i, j, k, l, m, n;
    int num_frames_for_i;
    int legalFrame[MAX_FRAMES];
    double **rank4mat, *w, **v_tr;
    double thresh4;
    BOOL foundThresh;
    double sum;
    int *dummy; /* for the call of findTensor */

    /* check that the number of frames is not too large */
    if (NumberOfFrames > MAX_FRAMES) {
        return(FALSE);
    }

    /* loop on the reference frame */
    for (rf = 0; rf < NumberOfFrames; rf++) {
        /* consider only marked frames */
        if (((specified_rf < 0) || (rf == specified_rf)) && Images_Flags[rf]) {
            /* loop on the i'th frame */
            for (i = 0; i < NumberOfFrames; i++) {
                /* consider only marked frames different from rf */
                if (Images_Flags[i] && (i != rf)) {
                    /* initialize num_frames_for_i */
                    (*size_E_basis) = 0;
                    num_frames_for_i = 0;
                    /* loop on the base frame.
                     * There is a dependence between Tensors[rf][i][bf]
                     * and Tensors[rf][bf][i], so we avoid the computation
                     * of Tensors[rf][i][bf] for bf < i */
                    for (bf = (i+1); bf < NumberOfFrames; bf++) {
                        /* consider only marked frames different from rf */
                        if (Images_Flags[bf] && (bf != rf)) {
                            findTensor(Tensors[rf][i][bf], rf, i, bf,
                                       0, dummy, 0, nTrials);
                            legalFrame[num_frames_for_i] = bf;
                            num_frames_for_i++;
                        }
                    }
                    for (bf = 0; bf < i; bf++) {
                        if (Images_Flags[bf] && (bf != rf)) {
                            for (j = 0; j < 3; j++) {
                                for (k = 0; k < 3; k++) {
                                    for (l = 0; l < 3; l++) {
                                        Tensors[rf][i][bf][j][k][l] =
                                            Tensors[rf][bf][i][k][j][l];
                                    }
                                }
                            }
                            legalFrame[num_frames_for_i] = bf;
                            num_frames_for_i++;
                        }
                    }

                    if (num_frames_for_i > 1) {
                        /* there are num_frames_for_i Tensors for these rf
                         * and i, and when ordering the E0,E1,E2 into a
                         * 9-by-(3*num_frames_for_i) matrix it is of rank 4.
                         * Note that E[j] = Tensors[rf][i][bf][*][j][*] */
                        (*size_E_basis) = 4;
                        /* if num_frames_for_i > 2 we consider the transposed
                         * matrix for the svdcmp, which requires rows >= cols */
                        if (num_frames_for_i == 2) {
                            rank4mat = dmatrix(1, 9, 1, 6);
                            w = dvector(1, 6);
                            v_tr = dmatrix(1, 6, 1, 6);
                            for (bf = 0; bf < num_frames_for_i; bf++) {
                                for (j = 0; j < 3; j++) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            rank4mat[1+3*k+l][1+3*bf+j] =
                                                Tensors[rf][i][legalFrame[bf]][k][j][l];
                                        }
                                    }
                                }
                            }
                            if (!svdcmp(rank4mat, 9, 6, w, v_tr)) {
                                free_dmatrix(rank4mat, 1, 9, 1, 6);
                                free_dvector(w, 1, 6);
                                free_dmatrix(v_tr, 1, 6, 1, 6);
                                return(FALSE);
                            }
                            /* look for a threshold isolating the largest 4
                             * singular values */
                            foundThresh = FALSE;
                            thresh4 = 0.02;
                            while (!foundThresh) {
                                n = 0;
                                for (k = 1; k <= 6; k++) {
                                    if (w[k] > thresh4) {
                                        n++;
                                    }
                                }
                                if (n > 4) {
                                    thresh4 *= 2;
                                }
                                else if (n < 4) {
                                    thresh4 /= 2;
                                }
                                else {
                                    foundThresh = TRUE;
                                }
                            }
                            /* the 4 columns of rank4mat which correspond to
                             * the 4 largest singular values form E_basis.
                             * Note that E_basis is [4][9], i.e. goes by rows */
                            n = 0;
                            for (m = 1; m <= 6; m++) {
                                if (w[m] > thresh4) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            E_basis[rf][i][n][k][l] =
                                                rank4mat[1+3*k+l][m];
                                        }
                                    }
                                    n++;
                                }
                            }
                            if (!ImproveE_basis(E_basis[rf][i])) {
                                free_dmatrix(rank4mat, 1, 9, 1, 6);
                                free_dvector(w, 1, 6);
                                free_dmatrix(v_tr, 1, 6, 1, 6);
                                return(FALSE);
                            }
                            /* improve the tensors with the svd results */
                            for (bf = 0; bf < num_frames_for_i; bf++) {
                                for (j = 0; j < 3; j++) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            sum = 0.0;
                                            n = 0;
                                            for (m = 1; m <= 6; m++) {
                                                if (w[m] > thresh4) {
                                                    sum += E_basis[rf][i][n][k][l] *
                                                           w[m] * v_tr[1+3*bf+j][m];
                                                    n++;
                                                }
                                            }
                                            Tensors[rf][i][legalFrame[bf]][k][j][l] = sum;
                                        }
                                    }
                                }
                            }
                            free_dmatrix(rank4mat, 1, 9, 1, 6);
                            free_dvector(w, 1, 6);
                            free_dmatrix(v_tr, 1, 6, 1, 6);
                        }
                        else {
                            rank4mat = dmatrix(1, 3*num_frames_for_i, 1, 9);
                            w = dvector(1, 9);
                            v_tr = dmatrix(1, 9, 1, 9);
                            for (bf = 0; bf < num_frames_for_i; bf++) {
                                for (j = 0; j < 3; j++) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            rank4mat[1+3*bf+j][1+3*k+l] =
                                                Tensors[rf][i][legalFrame[bf]][k][j][l];
                                        }
                                    }
                                }
                            }
                            if (!svdcmp(rank4mat, 3*num_frames_for_i, 9, w, v_tr)) {
                                free_dmatrix(rank4mat, 1, 3*num_frames_for_i, 1, 9);
                                free_dvector(w, 1, 9);
                                free_dmatrix(v_tr, 1, 9, 1, 9);
                                return(FALSE);
                            }
                            /* look for a threshold isolating the largest 4
                             * singular values */
                            foundThresh = FALSE;
                            thresh4 = 0.02;
                            while (!foundThresh) {
                                n = 0;
                                for (k = 1; k <= 9; k++) {
                                    if (w[k] > thresh4) {
                                        n++;
                                    }
                                }
                                if (n > 4) {
                                    thresh4 *= 2;
                                }
                                else if (n < 4) {
                                    thresh4 /= 2;
                                }
                                else {
                                    foundThresh = TRUE;
                                }
                            }
                            /* the 4 rows of v which correspond to the
                             * 4 largest singular values form E_basis */
                            n = 0;
                            for (m = 1; m <= 9; m++) {
                                if (w[m] > thresh4) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            E_basis[rf][i][n][k][l] =
                                                v_tr[1+3*k+l][m];
                                        }
                                    }
                                    n++;
                                }
                            }
                            if (!ImproveE_basis(E_basis[rf][i])) {
                                free_dmatrix(rank4mat, 1, 3*num_frames_for_i, 1, 9);
                                free_dvector(w, 1, 9);
                                free_dmatrix(v_tr, 1, 9, 1, 9);
                                return(FALSE);
                            }
                            /* improve the tensors with the svd results */
                            for (bf = 0; bf < num_frames_for_i; bf++) {
                                for (j = 0; j < 3; j++) {
                                    for (k = 0; k < 3; k++) {
                                        for (l = 0; l < 3; l++) {
                                            sum = 0.0;
                                            n = 0;
                                            for (m = 1; m <= 9; m++) {
                                                if (w[m] > thresh4) {
                                                    sum += rank4mat[1+3*bf+j][m] * w[m] *
                                                           E_basis[rf][i][n][k][l];
                                                    n++;
                                                }
                                            }
                                            Tensors[rf][i][legalFrame[bf]][k][j][l] = sum;
                                        }
                                    }
                                }
                            }
                            free_dmatrix(rank4mat, 1, 3*num_frames_for_i, 1, 9);
                            free_dvector(w, 1, 9);
                            free_dmatrix(v_tr, 1, 9, 1, 9);
                        }
                    }
                    else if (num_frames_for_i == 1) {
                        (*size_E_basis) = 3;
                        /* keep the tensor as the first 3 of E_basis[rf][i] */
                        for (j = 0; j < 3; j++) {
                            for (k = 0; k < 3; k++) {
                                for (l = 0; l < 3; l++) {
                                    E_basis[rf][i][j][k][l] =
                                        Tensors[rf][i][legalFrame[0]][k][j][l];
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    return(TRUE);
}

/* helper: normalizes the candidate equation in row n_Eq of Eqn and accepts
 * it (returns n_Eq+1) unless it is identically zero; this factors the
 * normalization step that is repeated throughout the original listing */
static int acceptEquation(double **Eqn, int n_Eq)
{
    double row_sum = Eqn[n_Eq][0]*Eqn[n_Eq][0] +
                     Eqn[n_Eq][1]*Eqn[n_Eq][1] +
                     Eqn[n_Eq][2]*Eqn[n_Eq][2];
    if (row_sum > 0.0) {
        row_sum = sqrt(row_sum);
        Eqn[n_Eq][0] /= row_sum;
        Eqn[n_Eq][1] /= row_sum;
        Eqn[n_Eq][2] /= row_sum;
        return(n_Eq + 1);
    }
    return(n_Eq);
}

BOOL ComputeAllEpipoles(IN DMAT3 E_basis[MAX_FRAMES][MAX_FRAMES][4],
                        IN int size_E_basis,
                        IN BOOL removeOutliers,
                        OUT DXYZ Epipoles[MAX_FRAMES][MAX_FRAMES])
{
    int rf, i;
    double error;
    int n_Eq;
    int j, k, l, m, row;
    double **Eqn;
    int nCols = 3, nSizes = 1, nGroups[1], Sizes[1];
    int minRows = 4, nTrials = 100, n_Eq_new;
    DMAT3 mat;

    /* check that the number of frames is not too large */
    if (NumberOfFrames > MAX_FRAMES) {
        return(FALSE);
    }

    /* given a pair of images (rf,i) there are
     * 3*size_E_basis*(size_E_basis-1) + 6*size_E_basis*size_E_basis
     * equations */
    Eqn = dmatrix(0, 3*size_E_basis*(size_E_basis-1) +
                     6*size_E_basis*size_E_basis - 1, 0, 2);

    /* loop on the reference frames */
    for (rf = 0; rf < NumberOfFrames; rf++) {
        /* consider only marked frames */
        if (((specified_rf < 0) || (rf == specified_rf)) && Images_Flags[rf]) {
            /* loop on the i'th frame */
            for (i = 0; i < NumberOfFrames; i++) {
                /* consider only marked frames that differ from rf */
                if (Images_Flags[i] && (i != rf)) {
                    /* accumulate equations on the epipole.
                     * Denoting E[j] = Tensors[rf][i][bf][*][j][*] and
                     * E[j][k] = Tensors[rf][i][bf][*][j][k], we have
                     * [v'] E[0] ~ [v'] E[1] ~ [v'] E[2] ~ F,
                     * the fundamental matrix.  In particular,
                     * transposed(E[j]) * [v'] * E[l] is a skew symmetric
                     * matrix, and therefore
                     *   v' CROSS E[j][k] dot E[l][k] = 0  for j<l,
                     * and also
                     *   v' CROSS E[j][k] dot E[l][m] +
                     *   v' CROSS E[j][m] dot E[l][k] = 0  for j<l, k<m.
                     * Equivalently (being the same "volume"),
                     *   E[j][k] CROSS E[l][k] dot v' = 0  for j<l, k=0,1,2,
                     * and also
                     *   E[j][k] CROSS E[l][m] dot v' +
                     *   E[j][m] CROSS E[l][k] dot v' = 0  for j<l, k<m. */
                    n_Eq = 0;
                    for (row = 0;
                         row < (3*size_E_basis*(size_E_basis-1) +
                                6*size_E_basis*size_E_basis);
                         row++) {
                        for (l = 0; l < 3; l++) {
                            Eqn[n_Eq+row][l] = 0.0;
                        }
                    }
                    /* loop on E_basis[j] */
                    for (j = 0; j < (size_E_basis-1); j++) {
                        /* loop on E_basis[l] */
                        for (l = (j+1); l < size_E_basis; l++) {
                            /* loop on the column */
                            for (k = 0; k < 3; k++) {
                                Eqn[n_Eq][0] =
                                    E_basis[rf][i][j][1][k]*E_basis[rf][i][l][2][k] -
                                    E_basis[rf][i][j][2][k]*E_basis[rf][i][l][1][k];
                                Eqn[n_Eq][1] =
                                    E_basis[rf][i][l][0][k]*E_basis[rf][i][j][2][k] -
                                    E_basis[rf][i][l][2][k]*E_basis[rf][i][j][0][k];
                                Eqn[n_Eq][2] =
                                    E_basis[rf][i][j][0][k]*E_basis[rf][i][l][1][k] -
                                    E_basis[rf][i][j][1][k]*E_basis[rf][i][l][0][k];
                                n_Eq = acceptEquation(Eqn, n_Eq);
                            }
                            for (k = 0; k < 2; k++) {
                                for (m = (k+1); m < 3; m++) {
                                    Eqn[n_Eq][0] =
                                        E_basis[rf][i][j][1][k]*E_basis[rf][i][l][2][m] -
                                        E_basis[rf][i][j][2][k]*E_basis[rf][i][l][1][m] +
                                        E_basis[rf][i][j][1][m]*E_basis[rf][i][l][2][k] -
                                        E_basis[rf][i][j][2][m]*E_basis[rf][i][l][1][k];
                                    Eqn[n_Eq][1] =
                                        E_basis[rf][i][l][0][m]*E_basis[rf][i][j][2][k] -
                                        E_basis[rf][i][l][2][m]*E_basis[rf][i][j][0][k] +
                                        E_basis[rf][i][l][0][k]*E_basis[rf][i][j][2][m] -
                                        E_basis[rf][i][l][2][k]*E_basis[rf][i][j][0][m];
                                    Eqn[n_Eq][2] =
                                        E_basis[rf][i][j][0][k]*E_basis[rf][i][l][1][m] -
                                        E_basis[rf][i][j][1][k]*E_basis[rf][i][l][0][m] +
                                        E_basis[rf][i][j][0][m]*E_basis[rf][i][l][1][k] -
                                        E_basis[rf][i][j][1][m]*E_basis[rf][i][l][0][k];
                                    n_Eq = acceptEquation(Eqn, n_Eq);
                                }
                            }
                        }
                    }
                    /* additional size_E_basis*size_E_basis*6 equations come
                     * from [v'] * E_basis[rf][i][j] * E_basis[i][rf][k]
                     * being skew symmetric for any j,k */
                    if (specified_rf < 0) {
                        /* loop on E_basis[rf][i][j] */
                        for (j = 0; j < size_E_basis; j++) {
                            /* loop on E_basis[i][rf][l] */
                            for (l = 0; l < size_E_basis; l++) {
                                matmulmat3(mat, E_basis[rf][i][j],
                                           E_basis[i][rf][l]);
                                /* ignore mat if it is a scalar matrix */
                                if ((mat[0][1] > mm_epsilon) || (mat[0][1] < -mm_epsilon) ||
                                    (mat[0][2] > mm_epsilon) || (mat[0][2] < -mm_epsilon) ||
                                    (mat[1][2] > mm_epsilon) || (mat[1][2] < -mm_epsilon) ||
                                    (mat[1][0] > mm_epsilon) || (mat[1][0] < -mm_epsilon) ||
                                    (mat[2][0] > mm_epsilon) || (mat[2][0] < -mm_epsilon) ||
                                    (mat[2][1] > mm_epsilon) || (mat[2][1] < -mm_epsilon)) {

                                    Eqn[n_Eq][0] = 0.0;
                                    Eqn[n_Eq][1] = mat[2][0];
                                    Eqn[n_Eq][2] = -mat[1][0];
                                    n_Eq = acceptEquation(Eqn, n_Eq);

                                    Eqn[n_Eq][0] = -mat[2][1];
                                    Eqn[n_Eq][1] = 0.0;
                                    Eqn[n_Eq][2] = mat[0][1];
                                    n_Eq = acceptEquation(Eqn, n_Eq);

                                    Eqn[n_Eq][0] = mat[1][2];
                                    Eqn[n_Eq][1] = -mat[0][2];
                                    Eqn[n_Eq][2] = 0.0;
                                    n_Eq = acceptEquation(Eqn, n_Eq);

                                    Eqn[n_Eq][0] = -mat[2][0];
                                    Eqn[n_Eq][1] = mat[2][1];
                                    Eqn[n_Eq][2] = mat[0][0] - mat[1][1];
                                    n_Eq = acceptEquation(Eqn, n_Eq);

                                    Eqn[n_Eq][0] = mat[1][0];
                                    Eqn[n_Eq][1] = mat[2][2] - mat[0][0];
                                    Eqn[n_Eq][2] = -mat[1][2];
                                    n_Eq = acceptEquation(Eqn, n_Eq);

                                    Eqn[n_Eq][0] = mat[1][1] - mat[2][2];
                                    Eqn[n_Eq][1] = -mat[0][1];
                                    Eqn[n_Eq][2] = mat[0][2];
                                    n_Eq = acceptEquation(Eqn, n_Eq);
                                }
                            }
                        }
                    }
                    if (removeOutliers) {
                        /* remove outliers */
                        nGroups[0] = n_Eq;
                        Sizes[0] = 1;
                        if (!removeEqOutliers(Eqn, nCols, nSizes, nGroups,
                                              Sizes, minRows, nTrials,
                                              SquaredThreshold, &n_Eq_new)) {
                            n_Eq = n_Eq_new;
                        }
                    }
                    if (n_Eq > 0) {
                        /* solve for the epipole by Jacobi */
                        find_sol_jacobi(Eqn, n_Eq, 3, Epipoles[rf][i], &error);
                    }
                }
            }
        }
    }
    free_dmatrix(Eqn, 0, 3*size_E_basis*(size_E_basis-1) +
                         6*size_E_basis*size_E_basis - 1, 0, 2);
    return(TRUE);
}

BOOL ImproveE_basis(INOUT DMAT3 E_basis[4])
{
    int i, j, k, l, m;
    double **u, **v_tr, *w;
    double min_sv, sum;

    /* Each of the 3 4-by-3 matrices E_basis[*][*][j] is of rank 2.  After
     * applying svd to each of them, ignore the smallest singular value
     * (which should be 0) in order to improve them */
    u = dmatrix(1, 4, 1, 3);
    v_tr = dmatrix(1, 3, 1, 3);
    w = dvector(1, 3);
    for (j = 0; j < 3; j++) {
        for (k = 0; k < 4; k++) {
            for (l = 0; l < 3; l++) {
                u[1+k][1+l] = E_basis[k][l][j];
            }
        }
        if (!svdcmp(u, 4, 3, w, v_tr)) {
            free_dmatrix(u, 1, 4, 1, 3);
            free_dmatrix(v_tr, 1, 3, 1, 3);
            free_dvector(w, 1, 3);
            return(FALSE);
        }
        /* find the smallest singular value */
        min_sv = w[1];
        i = 1;
        for (l = 2; l <= 3; l++) {
            if (w[l] < min_sv) {
                min_sv = w[l];
                i = l;
            }
        }
        /* replace the E_basis by the product ignoring w[i] */
        for (k = 0; k < 4; k++) {
            for (l = 0; l < 3; l++) {
                sum = 0.0;
                for (m = 1; m <= 3; m++) {
                    if (m != i) {
                        sum += u[1+k][m] * w[m] * v_tr[1+l][m];
                    }
                }
                E_basis[k][l][j] = sum;
            }
        }
    }
    free_dmatrix(u, 1, 4, 1, 3);
    free_dmatrix(v_tr, 1, 3, 1, 3);
    free_dvector(w, 1, 3);
    return(TRUE);
}

FUNCTION
BOOL
removeEqOutliers   /* remove outliers for a homogeneous system */
(
    INOUT double **Mat,         /* the initial equations */
    IN int nCols,               /* there are nCols columns (unknowns) */
    IN int nSizes,              /* it is assumed that the equations are
                                 * organized in groups; these groups
                                 * have nSizes different sizes */
    IN int nGroups[],           /* there are nSizes nGroups which mark
                                 * the number of groups of each size */
    IN int Sizes[],             /* there are nSizes Sizes which mark the
                                 * size of each group, e.g. the first
                                 * nGroups[0] are groups each of size
                                 * Sizes[0], and so on */
    IN int minRows,             /* each iteration, try at least minRows
                                 * number of rows */
    IN int nTrials,             /* number of random trials to perform */
    IN double SquaredThreshold, /* for boundary determination */
    OUT int *outMatRows         /* the number of rows in Mat to be
                                 * considered; these rows are the first
                                 * rows in the output Mat and do not
                                 * keep any grouping order */
)

DESCRIPTION

Given the initial equations in Mat, get rid of the outliers and keep the fitting equations as the leading rows of the output Mat.

Perform nTrials random trials; each time, randomly choose groups so that there are at least minRows equations. Solve this set of n_current_equations by Jacobi and check the solution against ALL equations. Keep track of the best solution (the one for which the norm of Mat*sol is minimal).

Use the best solution to determine outliers (keeping the initial partition of the equations into groups).

Put the groups which fit the best solution as the leading *outMatRows rows of Mat.

ASSUMPTIONS
The equations are norm~li7erl so that there is no need to consider the equation's no~n when perforrning the outliers removal AUTHOR
Tamir Shalom (April 10 1995) */
{

int i,j,k,l;
int iTrial;
int iRows;
int iGroup;
int maxSize;
int ini~i~lRows; /* how many initial rows */

SUBSTITUTE SHEET (RULE 2R~

W 096/34365 PCTrUS96/05697 int nTotalGroups; /* how many total groups */
int nAvailGroups; /* howmany available groups */
double **subMat;
int *from_row; /*
* starting row in Mat which * corresponds to i'th group * (among all nTotalGroups) $/

int *to_row_plusl; /* ending row .. */
int nrandGroup;
double error, boundary;
double *solution, *best_solution;
double squared_norm, entry, min_error;

/*
* Let maxSize be the maximal size of groups, then * the maximal number of rows that will be used is * minRows+maxSize-l .
*/
t* find maxSize, initialRows and nTotalGroups */
initialRows = nGroups[O]*Sizes[O];
nTotalGroups = nGroups[O];
maxSize= Sizes[O];
for (i=l; i~nSizes; i++) {
initialRows += nGroups[i]*Sizes[i];
nTotalGroups += nGroups[i];
if (Sizes[i] > maxSize) {
maxSize = Sizes[i];
}
}

}UI~J (RULE2~) W 096/34365 PCTAUS9~'0S6~7 if (initialRows < minRows) {
return(FALSE);
}
/* allocate subMatrix */
subMat = dmakix(O, minRows+maxSize-l - 1, O, nCols - l);
/* allocate solution and best_solution */
solution= dvector(O, nCols-l);
best_solution = dvector(O, nCols-l);
/* allocate ~om_row and to_row_plus 1 */
from_row = ivector(O, nTotalGroups-l);
to_row plusl = ivector(O, nTotalGroups-l);
/* perform the trials */
for (iTrial=O; iTrial < nTrials; iTrial++) {
/* initialize from_row and to_row plus 1 */
iRows = O;
iGroup = O;
nAvailGroups = nTotalGroups;
for (i=O; i<nSizes; i I I ) {
for (j=O; j<nGroups[i]; j++) {
from_rowtiGroup] = iRows;
iRows += Sizes[i];
to_row_plus 1 [iGroup] = iRows;
iGroup I l;
}
}

/* iRows denotes the number of already chosen equations */
iRows = O;

~ 71 SUBSTITUTE SHEET (RUL~ ~6 WO g6/3436~, PCTrU~9.'~'0'~7 while (iRows < minRows) {
nrandGroup = randomn(nAvailGroups);
/* move the group to the subMat */
for (i=from_rowtnrandGroup];
i < to_row_plus 1 [nrandGroup];
il 1) {
for (j=O; j < nCols; j~) {
subMat[iRows]G] = Mat[i]rj];
}

iRows++;
}
/* update nAvailGroups, from_row and to_row~lus 1 */
nAvailGroups--;
for (i=nrandGroup; i < nAvailGroups, i~) {
from_row[i] = from_row[i+1];
to_row_plus 1 [i] = to_row_plus 1 [i+1~;
}

/* solve the equations of subMat */
find_soljacobi(subMat, iRows, nCols, solution, &error);
/* estimate the solution against all equations */
squared_norrn= 0.0;
for (i=O; i<initialRows; i++) {
/*
* compute the i'th entry of the resulting product of * Mat by solution */
entry= 0.0;
for (j=O; j<nCols; jt~) ~
entry += Mat[i]G]*s~1Uti~nG];

EEr (~LE 26) W 096/34365 PCTrUS96/05697 squared_norm += entry*entry;
}

if ((iTrial O) 11 (squared_norm < min_error)) {
min_error= squared_norm;
/* printf("iTrial %d error= %g\n", iTrial, squared_norm); */
/* keep the best solution so far */
for (j=O; j<nCols; j I I ) {
best_solutionrj] = solutionrj];

}
}

/* determine boundary for rejection of outliers */
boundary = min_error*SquaredThreshold/(initialRows- 1 );
/* remove outlying groups of equations from Mat */
(*outMatRows) = O;
iRows = O;
iGroup = O;
for (i=O; i<nSizes; i++) {
for (j=O; j<nGroups[i]; j I I ) {
/* accumulate the norm of the whole group */
squared_norm =O .0;
for (k=O; k<Sizes[i]; k++) {
entry = 0.0;
for (1=0; l<nCols; 11 1 ) {
entry += Mat[iRows + k][l]*best_solution[l];
squared_norrn += entry*entry;

SUBSTITUTE SHEET (RULE 26 W O9-'3~ PCTnUS96/05697 iGroup++;
if (squared_norm < Sizesri]*boundary) {
/* add Sizes[i] equations */
/*
printf (" %d of kind %d is taken\n", j, i);
~ _~ */
if ((*outMatRows) iRows) {
/* there is no need to copy the equations */
/* yet (*outMatRows) should be updated */
(*outMatRows) += Sizes[i];
}
else {
for (k=O; k<Sizes[i]; k++) {
for (1=0; l<nCols; 11 1 ) {
Mat[(*outMatRows)][l] = Mat[iRows + k][l];
}

(*outMatRows)l l;
}

}
iRows += Sizes[i];
}
}

free_dmatrix(subMat, O, minRows+maxSize-l - 1, O, nCols - l);
free_dvector(solution, O, nCols-l);
free_dvector(best_solution, O, nCols-l);
free_ivector(from_row, O, nTotalGroups-l);
free_ivector(to_row_plus 1, O, nTotalGroups- 1 );
printf ("after outliers removal only %d out of %d equations\n", (*outMatRows), initialRows);
return(TRUE);
}

SUBSTITUTE SHEET (RULE 26) PCTrUS96105697 APPENDIX B

SU~111u~ (RULE 2~) W 096/34365 1.-1n~'_./05697 .~illlUl;t~iC n ot' U~itl~ irilillearT~n~ /rscclle rCCon.~tr::CIiOn f ~n~ ~hrec vie~
3D points are E~ene-atecl hy ~ p.~cuc30 r~nclt)n~ gencr3lc~r ~nd lhc views ure gener;-lecl hy choosin~ Gllller;l y;lrallleters:
~~hc firsl vicw c~mer~ tr~n~form~tion i~ the 3x4 m~t~ tl 01 where J is the identily ~natrix Thc seco;ld view camer~ tran~form;ltioll i.~ the 3x4 m~trix tR vpl ~vllere ~ is ~ rotativn matrix ~nd ~ p .~ tr~nslt~tioll.
Th~ thirà vie~v i~ ~enel-~t ~ v ~pp vyp] ~llere Rpp i~ a ro;,aticJIl loatri% and vr~p is a lr~nslatio de~ine space pOitltS and im~epoinî~ortl~e ahrce ima~,c~;
gea hack sp~ce pointsillP ~nd inla~c points in p,pp,r)pp get lmage_points():

> - - . _ sulv~ for trilinear tensor fic)m p,pp,p~y and gel back El ,E2,E3 (the thrcc maarices rcpresenting thc lensor) > fnd_ten~or(p,pp,ppp):
.. . . . _ .
aiven the tensor (.~pr..~ntcd by the m~trices El,E2.E3) wecompu~e theepipolargeomctric ~ ..pl esentation of tlle views one ancl ~wo. 'rhe cpipol~lr gec~mctry cc711sists t)~ ~ 3x3 matrix F s~tisfying tlle con~;lraint transposc(E))F + F ~ranspose(Ej)=() nnd ahe cpipolc v' satisfying tllc c~nstlailla F~tv'=0_ fnd_matrlx(E1,E2,~3~:
vp_from_F(F):

> - .
i h~ projectivc coordinate~ of the !~cen~ s ~ti~i3y the Collstrs,inL:
pp = Elp ~ kv' ~n~i uhe resultin~ oooldin~tes ;ire (x,y,I,k) ano~her pro~ective represen~ation is hy (x,y, I ,~) where s satisfie~
pp = ~v ~rp + ~v' where [v'] is lhc ske~v-.cynmlelric matrix :~c50ci ~ x wilh vcctor product.
Wc i~lplemcnt here ol~ly the first Ic~.lcscntalion.
prolP:_matrlx(4,1 0):
for from 1 to 10 do pro, P 1,,]:= p 1,,~:
pro,'P 2, j:= p ,2,,'~:
proJP-3,,'1:=P.3~
proJP ~,, ]:=getk(col(pp,j),~ralm(Et &~ col(p,j3), vp_est):
od:
l'o go ~rom 3 rojcctive lo l~uciidean coordinsltes we us~ five controi point:i.
~ h~r~ are olher way!i to find ~h~: projec-ivc-~o-l uclidcfln trun~i~ormation, but h~re we implement the silnple~ way by lJ~itlgcolltr(~lpoillt get hack malriX ~
proj2E~(proJP,P):

SUBSTITUTE S~!EET (RULE 26 W 096/3436~ PCTAUS36'0SC~7 comp~cprojective and F.~ n J.~,. ~7atrix(4,10,i-~O):
for j from 1 to 10 do P_ha~:=evalm(T &~ col~projP,j)):
P_hat 1~:=P_hat~lllP_hat~4]: P_hat~2]:~P_ha~2]/P-hatt4~:
P_hat,3]:-P_hat[33/P_hatt4~:
P_hat 4]~
for i from 1 to 4 do P~Q,np~i,j3:=eYalm(p-hatri3 - P~i,j33:
od: od:
evalm(Pcomp~;
Collec~ion of subroutincs (p~cedl,res).
with(linal~):
Warning: s~ew definirion for nonn Warnir.g: ~ew de~inition for trace ~nd_tensor:=proc(p,pp,ppp) c(4~,27,i-~0):
j:--~:
for;from 1 to 10 do 1~ ,1 :=evalm(-p[1 ,2 :=evalm~-p~
,3 :=~1;
rr ,7:=evalm~ppp~ P
,8 :-eva1m~ppp 1,i. ~Pr TT ~ ,9 :=eYalm(ppp 1 ,i. ):
sv~lm(pp 1~i ~pE~ J:
1~ ,20 :=eYalm(pp ;l,i ~p~,i]3:
rr - ,21 :=evalm(pp ,1 ,i ):
rr~"2~ :=evalm(- ppp~l,i]~pp~ p[1,i]):
,25 :=sYalm(- ppp~1"~pp~1,i]~p~2,i~):
rr ,27 :=evalm(-pp ~tt ~ ppr1 ~i]):
TT +1,4 :=~valm(-p 1,i3):
rr ~ :=evalm(-p 2,i~);
1 ,6 :=-1:
~ ~ :=evalm(ppp~2,i ~p~1,1]):
t,8 :=evalm(ppp 2,i ~p~2,i]):
rr; ~1 ,g. :=evalm(ppp .2,i ):
1,22~:=evalm~pp 1,i ~p~1.1l):
+1,23]:=evalm pp 1,i ~p~2,i]):
1~ +1,24~:=evalm pp 1,i ):
,2~3 =eYalm(- Pppi2,i~ppt1,i ~p~l,i]) rr~+1,26~:=evalm(- ppp~2,i~pp~ p[2,i]):
1 ,27]:=evalm~-ppp~2,i]~pp~1 ,i~,:

SUBS1~ SH~T ~RULE 26 -W 096/3436~ PCTrUS9G/0~6~7 'j+2,10 :=evalm(-p[1,i]):
j~2,1 1 =evalm(~p[2!i~) 2,12' =-1:
+2,16 :_e~al.~(ppplt,i~pE1,i~):
TT ','+2,17; :_evalm(ppp[1,i]~pL~
TT- 12,18 :=evalm(pp~{1,1]):
rr *2,19 :-~valm~pp 2,1~p[1.il):
rr +2,20 :=evalm(pp 2,i~p[2,i~:
+2,21 :=evalm(pp 2,i]):
1-r +:2,25 :=evalm(- ppp~1,i]~pp[2,i tp~t,~
2,2~ :oevalm(- ppp~1,i]~ppr2,i.~P~2.i]3:
7~r +2,27]:-evalm(-ppp[1,i]~pp~2,i],:
3,13j:=evalm(-p~
7~ ' I 3,14 :=~valm~-p~2.i~):
l~ 3,15 ~
TT; 1 3,t 6 :=~Yalmtppp 2,i ~p~1 ,I]):
3,17 :-~valm(ppp~2,i ~p~2,i3):
rr i l 3,18 :-evalm(ppp 2,i ):
3,22 :=e~alm(pp~2,i Sp~
+3,23 :=evalm(ppt2,i~pr2,i~) TT ~3,24 :-evalm(pp~2,i ):
rr 1 3,2~ :_evalm(- ppp~2,i]*pp~2,iJ~p~1 ,i]~:
3,26.:=~3valm(-ppp~2,il~pp~ p~2,i~):
rT +3,27 :=evalm(-ppp~2,i]~pp~2,i~):
J:=:~4:
od:
b:=-CQI(7'T,27~:
~:=sul~lllatr;,~ ,1..40,1..:26):
sol:=lPastsqrs(TT,b~:
X:-vector(27,i-~1):
for i from 1 to 26 do X[i~:-sol~ d:
alpha:=array(1..3,t..3,1..3):
forifrom1to3do for j from t to 3 do for k from 1 to 3 do alpha~i,j,k3:=X~k+(i 1)~9 ~ 3];
od: od: od:
E1 :-matrix(3,3):
E2~ lri~3,3):

~3:_i.ldtli-c(3,3):

for i from 1 to 3 do for k from 1 to 3 do E1 i,k~:-alpha~1,1,k~:

E2,i,k]:=alphati,2,kl:

E3 i,k~:=alpha~i,3,k~:

od: od:

W 096/3436~ PCTrUS96/05697 ~ W1 :=rnatrix(3,3):
> ~i2:_snatrlx(3,3~:
> W3:=matrlx~3,3);
for j trom l io 3 do ~ for k from 1 to 3 do > W1 [ ,k~:=alpha~1 ,j,k]:
W2t ,k]:=~lpha[2,J,k]:
W3L',k]:-alphai'3,J,k]:
> od: od:
>
> end:
._ . . . _ .. ..
>
> tnd_matrix:=proc(M1 ,M2,M3) ~ Test:=matrix(1 8,9,1-~0):
> Testr1,i~ ol(M1,~ ri];Test[2,h3]:=co l(M1,2)tl];Testir3,i+6]:=CQI[M1~3)[1];
> T~Sti'4~i]:=~ol(M1~2)ti];
> To~t~4,1 1 3]:=col(M1,1)[i];
> Test~5,i]:=col(M1 ,3)ti];
> Test[5,1~8 :=col(M1,1) 1];
~ Test~6,1+3 :=col(M1,3).1];
> Test[6,1+6 -=col(M1,2) 1];
~ Testi7,1]:=col(M2,1 ~i];
> Te8t~8,1+3]:=col(M2,2) ll;
> Te~tt9,1~6i:=col(M2,3),1];
> Test[1 0,1]:=col(M2,2)[1;
> Test[1 0,i~3]:=col(M2,1 )~i];
Test 11,1l:=col(M2,3)[1];
Test 11,1 1 6l:=col(M2,1 )t~l;
~ Test'12,113]:=col(M2,3)~i];
> Te~t~1 2,i~6]:-col(M2,2)~t];
> TQ~t~13~i]~ ol(M3~1)tl];
> Test~1~,1 1 33:=col(M3,2)~i];
> Testl16,i 1 6]:=col~M3,3)[1~;
> Test[1 6,i]:=col(M3,2)t;];
> Test~1 6,i~3]:-col(M3,1 )[1l;
> Test~17,i~:=col(M3,3)~1];
Test~1 7,i~6 :-col(M3,1 ) I;
> T~st~18,113 :=col(M3,3) I;
> Testt18,i+6 :=col(M3,2) i.;
, od:
~ b:=-col(T~t,~):
> Te~t:=~;ubmatrlx(Test,1..1~,1..8):
~ ~ol:=le~stsqr~(Te3t,b):
> X:=vector(9,i->1 ):

SUBST~ HEET ~RU~ 26 W 096/34365 P~l/u~_. 056~7 ~ for i from 1 to 8 do X~i]:=sol[i~; od:
> F:=matrix(3,3):
, for i from 1 to 3 do > for ~ from 1 to 3 do F~ =X[i+ (~ 3];
> od: od:
, end:

~kew cross:-proc(w~
~tack(eYalm(crossprod([1 ,O,O],w)~,stack(evalm(crossprod(~0,1 ,QLw)),eY~lm(cro -~8prod(~0,D,1 ],w))));
> end:
.. ..
gelepipole~rom F~tv'=V
> vp_from_F:=proc(F) :. G:=transpo~e(F):
:~ b:=-col(G,3):
G:=~ubmatrix(G,1..3,1..2):
sol~ astsqrs((;,b):
vp_e~t:=v~ctor(3,i~
estE1]:=soll1]: vp_est[2]:=sol[2~:
> end: ~

> ~
t-lm2ge-point~ =proc() > r~ndn:=rand(1..1 00):
~, R:=rot_mat(0.9,[0.3,0.5Øg~):
> Rp:=rot_m~t(O.9,t2.3,0~7,1~1}): -.
> vpp:-~ector(3,[1.1,3.3,1~0~):
> ~/p:=vector(3,~0.5,-2.2,1.0]):
P:=matrix(4,10):
> for I from t to 3 do for j from 1 to 10 do P~ a~dt~ l O;
> od: od:
> for I from t to 10 do ~ P14,j]:=1;
> od:
> A:=augment(R,vp):
> ~3:=augmnnt(Rp,vpp):
~ p:=evalm(submatrix(P,1,.3,1..10)):
> pp:=evalm(A &~ P):

SUBST~TE S~tF~ ULE 2 CA 022l93l4 l997-l0-24 W O9f'3~
P~l/U~ 5C~7 ~, ppp:-eyalln(~ &'P?:
> for j ~roln 1 ~o ~ ~
~ for i from 5 ;o 2 do > pli,j]:=p~ 3/pr3,jl; pp[i,j]:=ppL;,ji/pPI3~; PPPti~i]:=PPP[i~J~/PPP[3~il;
od: od:
for jfrom1tolOdo P[3,i~ : pp[3,j]:=1: ppp[3,j]:=1;
od:
~nd:
. _ , ........... . _ _ .. . .

tran!;fnrr~l back ~
Wc need ~ Ic~st S poinl~.
proJ2Euc:=proc(projP,P) T:-matrix(1 ~,1 6,i-~0):
for 3 from t to 5 do i:=1 +(j-1)-3:
T~i,1 :=col(P, ) 4 'col(pro P, )[1];
Tti,2 :=col~P, ) 4 'col(pro P, )l2~:
T[i,3 :=col(P, ) 4 ~col~pro P, )[3]:
T~i,4]:=col(P, ) 4l~col(projP~i)E4:
T[i,13 :=-col(', ) 1 'col(pro P,1) 1~;
T[1,14 :=-col(P, ) 1 'col~pro P,J) 2~:
T[i,15 :_-col(P, ) 1 ~col(pro P,1) 3]:
T~ col(P, ) 1 ~col(pro P,¦) 4]:
>

Tli+1 ,53:-c~l(p~J)[4l~col(proJp~J)t1 ]:
T[i+1,6]:=col(P, )t4]~col(projP,j)~2]:
T{i+1,73:=col(P, )[4]~col(projP,j)[3]:
Trl+1,8]:=col(P~ )[4~-col(projP,l)t4]:
T~i+1,13~ col(~, )[2]~col(projP, ) 1:
T~i+1,14~ col(P, )[2~tcol(projP. ).2:
T[i~ ]:=-col(P, )E2]~col(projP,)3.:
Tti+1,16~ col(P, )t2]~col(projP, ) 4. :
~ .
> T[i~2,~]:-col(P,j)t4]~col(projP~j)[1]:
T[i+2,10~ oi(P,j)[4]~col(projP, ) 2:
Tti+2,11~:=col( P,j)l4]~col(projP, ) 3:
T[1+2,12]:=col(P,l)t~]~col(prolP~ ) 4:
T[i~2,13~ col(P,j)~3]'col(projP,j)[1 l:
> T~i+2~14l:=-col(p~i)[3ltcol(proip~ 2]:
> T~1+~,16~ col(P,j)[3l~col(projP,j)[3~:
> T~i+2,t 6]:=-col(p~j)[3]'col(projp~l)[4]:
, c~d:
>

SU~ E 26) W 096/34365 PCTrUS9r'0~6~7 b:=-col(T,1 6):
T:=~ubm~trlx(T,1..15,1..1~):
~o3:_1in~olve(T,b):
X:-ve~,o. ~
for i from 1 to 1~ do X[l~:=sol[i~; od:
T:=matrix(4,4):
for i from 1 to 4 do for J from 1 to 4 do T~i,j]:=XU+(i 1)~4~:
od: od:
end:
>

.. ~
> getk:=proc(~,b,c3 evalm((transpose(cros~prod(a,b)) ~
crossprod(c,a))/(transpose(cros~;prod(c,a)) &* crossprod(c,a)));
nd:

.
rot_mat:=proc(angle,w) w_unit:=evalm~(1/norm(w,frobenius))~w);
WW:=evalm(w_unit &~ tr~nspose(w_unit)):
cos_mat:=evalm(co~(angle) ~ array(identlty,1..3,1..3));
W_ anti~iym:=stack(evalm(cross~rc.~ 1 ,O,Ol,w_unlt)),stack(evalm(crt~ssprod(rO, 1 ,O~,w_unit)),evalm(crossprod([0,0,1 3,w_unit))));
evalm(cos_mat + sin(an~le)~W_antisym + ~1-cos~angle))~WW);
end:
cross_mat:=proc(w) evalm(stack(evalm(cro3sprod(~t,0,0~,w_unit)),stack(eYalm(crossprod~{0,1 ,O],w _unit)),evalm(crossprod(tO,Q,1 ],w_unit)))~);
end:

SUBSTITUTE ST IEET (RULE 26)

Claims (40)

1. A method for generating information regarding a 3D
object from at least one 2D projection thereof, the method comprising:
providing at least one 2D projection of a 3D object;
generating an array of numbers described by:
.alpha.ijk = Vi' bjk ~ Vj" aik (i,j,k = 1,2,3), where aij and bjk are elements of matrices A and B respectively and vi' and vi" are elements of vectors v' and v" respectively, wherein said matrices and vectors together describe camera parameters of three views of the 3D object; and employing the array to generate information regarding the 3D object.
2. A method for generating a new view of a 3D object from first and second existing views thereof having n corresponding points Pi and Pi' (i = 1 ... n), the method comprising:
generating a tensor .alpha.ijk; and plugging the values of points Pi and Pi' (i = 1 ... n) into trilinear forms, thereby to extract an x", y" value representing a location in the new view; and generating the new view on the basis of the result of the plugging in step.
3. A method for reconstructing a 3D object from at least one 2D projection thereof, the method comprising:
providing at least one 2D projection of a 3D object;
generating an array of numbers described by:
.alpha.ijk = Vi' bjk - vj" aik (i,j,k = 1,2,3), where aij and bjk are elements of matrices A and B respectively and vi' and vi" are elements of vectors v' and v" respectively, wherein said matrices and vectors together describe camera parameters of three views of the 3D object;
permuting the array into three homography matrices associated with three corresponding planes in 3D space; and employing the three homography matrices to reconstruct the 3D object.
4. A visual recognition method comprising:
providing three perspective views of a 3D object between which a trilinear relationships exists; and employing the trilinear relationship between the views in order to perform visual recognition by alignment.
5. A method according to claim 4 and also comprising reprojecting the 3D object.
6. A method according to claim 1 wherein said information regarding the 3D object comprises a reconstruction of the 3D
object.
7. A method according to claim 1 wherein said information regarding the 3D object comprises at least one new view of the 3D
object generated without reconstructing the 3D object.
8. A method according to claim 1 wherein said at least one 2D projection comprises at least one aerial photograph.
9. A method according to claim 1 wherein said at least one 2D projection comprises at least one satellite photograph.
10. A method according to claim 1 wherein said information regarding the 3D object comprises at least one coordinate of the 3D object.
11. A method according to claim 1 wherein the 3D object comprises an aerospace object.
12. A method according to claim 1 wherein the 3D object comprises a large object such as a ship.
13. A method according to claim 1 wherein the 3D object comprises a nonexistent object.
14. Apparatus for generating information regarding a 3D
object from at least one 2D projection thereof, the apparatus comprising:
apparatus for providing at least one 2D projection of a 3D object;
an array generator operative to generate an array of numbers described by:
.alpha.ijk = Vi' bjk - Vj" aik (i,j,k = 1,2,3), where aij and bjk are elements of matrices A and B respectively and vi' and vi" are elements of vectors v' and v" respectively, wherein said matrices and vectors together describe camera parameters of three views of the 3D object; and an array analyzer employing the array to generate information regarding the 3D object.
15. Apparatus for generating a new view of a 3D object from first and second existing views thereof having n corresponding points Pi and Pi' (i = 1 ... n), the apparatus comprising:
apparatus for generating a tensor .alpha.ijk; and apparatus for plugging the values of points Pi and Pi' (i = 1 ... n) into trilinear forms, thereby to extract an x", y"
value representing a location in the new view; and apparatus for generating the new view on the basis of the result of the plugging in step.
16. Apparatus for reconstructing a 3D object from at least one 2D projection thereof, the apparatus comprising:
apparatus for providing at least one 2D projection of a 3D object;
an array generator operative to generate an array of numbers described by:
.alpha.ijk = vi' bjk - vj" aik (i,j,k = 1,2,3), where aij and bjk are elements of matrices A and B respectively and vi' and vi" are elements of vectors v' and v" respectively, wherein said matrices and vectors together describe camera parameters of three views of the 3D object;
an array permutator operative to permute the array into three homography matrices associated with three corresponding planes in 3D space; and a 3D object reconstructor operative to employ the three homography matrices to reconstruct the 3D object.
17. Visual recognition apparatus comprising:
apparatus for providing three perspective views of a 3D
object between which a trilinear relationships exists; and apparatus for employing the trilinear relationship between the views in order to perform visual recognition by alignment.
18. A method according to claim 1 wherein said at least one 2D projection comprises three 2D projections.
19. Apparatus according to claim 14 wherein said at least one 2D projection comprises three 2D projections.
20. A method according to claim 1 and also comprising employing a result of said method in order to perform coordinate measurement of industrial parts.
21. A method according to claim 1 and also comprising employing a result of said method in order to perform automated optical based inspection of industrial parts.
22. A method according to claim 1 and also comprising employing a result of said method in order to perform robotic cell alignment.
23. A method according to claim 1 and also comprising employing a result of said method in order to perform robotic trajectory identification.
24. A method according to claim 1 and also comprising employing a result of said method in order to perform 3D robotic feedback.
25. A method according to claim 1 and also comprising employing a result of said method in order to perform 3D
modelling of scenes.
26. A method according to claim 1 and also comprising employing a result of said method in order to perform 3D
modelling of objects.
27. A method according to claim 1 and also comprising employing a result of said method in order to perform reverse engineering.
28. A method according to claim 1 and also comprising employing a result of said method in order to perform 3D
digitizing.
29. A 3D scene reconstruction method for generating a 3D
representation of a 3D scene from first, second and third views thereof, the method comprising:
providing first, second and third views of a 3D scene;
employing geometric information regarding said first, second and third views to generate a trilinear tensor representing the geometric relationship between the first, second and third views; and generating a 3D representation of the 3D scene from said trilinear tensor.
30. A method according to claim 29 wherein said step of generating a 3D representation comprises:
computing an epipolar geometric representation of the first and second views from the trilinear tensor; and generating said 3D representation from said epipolar geometric representation.
31. Image transfer apparatus for generating a novel view of a 3D scene from first and second reference views thereof, the apparatus comprising:
apparatus for providing first and second reference views of a 3D scene;
a trilinear tensor generator operative to employ geometric information regarding said first reference view, second reference view and novel view, respectively, to generate a trilinear tensor representing the geometric relationship between the first, second and novel views; and a novel view generator operative to generate said novel view by computing a multiplicity of novel view locations each corresponding to different first and second corresponding locations in said first and second reference views respectively based on said first and second corresponding locations and said trilinear tensor.
32. 3D scene reconstruction apparatus for generating a 3D

representation of a 3D scene from first, second and third views thereof, the apparatus comprising:
appparatus for providing first, second and third views of a 3D scene;
a trilinear tensor generator operative to employ geometric information regarding said first, second and third views to generate a trilinear tensor representing the geometric relationship between the first, second and third views; and a 3D scene representation generator operative to generate a 3D representation of the 3D scene from said trilinear tensor.
33. A method according to claim 29 wherein said providing step comprises providing m > 3 views of the 3D scene and wherein said employing step is performed for each of a plurality of subsets of 3 views from among the m views, thereby to generate a plurality of intermediate tensors, the method also comprising, prior to said step of generating, combining the plurality of intermediate tensors into a final trlinear tensor and wherein said step of generating comprises generating a 3D representation of the 3D scene from said final trlinear tensor.
34. A method according to claim 33 wherein said employing step is performed for the first and second views in combination with each of the remaining views, thereby to generate m-2 intermediate tensors.
35. A method according to claim 33 wherein said step of combining comprises:
arranging each of the intermediate tensors within a corresponding 9 x (3 (m-2)) matrix; and defining the final trilinear tensor as the four largest principal components of said matrix.
36. Matching point finding apparatus for finding a multiplicity of matching locations across three views, the apparatus comprising:
providing an initial set of matching locations across three views;
generating a trilinear tensor representing the relationships between said three views, based on said initial set;
and employing the trilinear tensor to generate a final set of matching locations.
37. Apparatus according to claim 36 wherein said step of employing comprises:
for each of a multiplicity of locations within a first of the three views:
generating first and second corresponding epipolar lines from the tensor which are associated with the remaining two views respectively;
selecting a first candidate matching location along the first epipolar line and computing a second candidate matching location along the second epipolar line based on the first matching location; and repeating the selecting step until the two candidate matching locations and the first view location match.
38. A method according to claim 29 and also comprising:
performing surface interpolation on the 3D representation, thereby to generate a surface interpolated output; and generating an orthophoto from the surface interpolated output.
39. A method according to claim 29 wherein said step of providing comprises sequentially positioning at least three imaging devices at a sequence of positions within the scene and imaging at least a portion of the scene, using each of the imaging devices, at each of the positions within the sequence.
40. A method according to claim 29 and also comprising comparing the 3D representation of the 3D scene to a stored model of the 3D scene.
CA 2219314 1995-04-25 1996-04-24 Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof Abandoned CA2219314A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
IL113496 1995-04-25
IL11349695A IL113496A (en) 1995-04-25 1995-04-25 Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof
US08/497,224 US5821943A (en) 1995-04-25 1995-06-30 Apparatus and method for recreating and manipulating a 3D object based on a 2D projection thereof
US08/497,224 1995-06-30

Publications (1)

Publication Number Publication Date
CA2219314A1 true CA2219314A1 (en) 1996-10-31

Family

ID=26323038

Family Applications (1)

Application Number Title Priority Date Filing Date
CA 2219314 Abandoned CA2219314A1 (en) 1995-04-25 1996-04-24 Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof

Country Status (6)

Country Link
EP (1) EP0832471A4 (en)
JP (1) JPH11504452A (en)
CN (1) CN1198230A (en)
AU (1) AU5667496A (en)
CA (1) CA2219314A1 (en)
WO (1) WO1996034365A1 (en)

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5877897A (en) 1993-02-26 1999-03-02 Donnelly Corporation Automatic rearview mirror, vehicle lighting control and vehicle interior monitoring system using a photosensor array
US6822563B2 (en) 1997-09-22 2004-11-23 Donnelly Corporation Vehicle imaging system with accessory control
US7655894B2 (en) 1996-03-25 2010-02-02 Donnelly Corporation Vehicular image sensing system
IL119831A (en) * 1996-12-15 2002-12-01 Cognitens Ltd Apparatus and method for 3d surface geometry reconstruction
ES2167885T3 (en) * 1997-04-04 2002-05-16 Isis Innovation APPARATUS AND METHOD OF IMAGE FORMATION BY MICROSCOPY.
US6201541B1 (en) * 1997-12-11 2001-03-13 Cognitens, Ltd. System and method for “Stitching” a plurality of reconstructions of three-dimensional surface features of object(s) in a scene defined relative to respective coordinate systems to relate them to a common coordinate system
DE59901080D1 (en) * 1998-06-30 2002-05-02 Siemens Ag DEVICE AND METHOD FOR CREATING A VIRTUAL SYSTEM MODEL
DE19832974A1 (en) 1998-07-22 2000-01-27 Siemens Ag Arrangement for generating virtual industrial system model compares system component information with real system image data to identify components in image data
US7206080B2 (en) * 2001-07-30 2007-04-17 Topcon Corporation Surface shape measurement apparatus, surface shape measurement method, surface state graphic apparatus
ES2391556T3 (en) 2002-05-03 2012-11-27 Donnelly Corporation Object detection system for vehicles
US7526103B2 (en) 2004-04-15 2009-04-28 Donnelly Corporation Imaging system for vehicle
WO2008024639A2 (en) 2006-08-11 2008-02-28 Donnelly Corporation Automatic headlamp control system
CN104596484A (en) * 2015-01-30 2015-05-06 黄河水利委员会黄河水利科学研究院 Method of measuring drift ice density in ice flood season of Yellow River
KR101865112B1 (en) 2017-03-07 2018-07-13 광주과학기술원 3D restoration apparatus and method including exterior material modeling
CN115697249A (en) 2020-06-01 2023-02-03 应用奈米医材科技股份有限公司 Bifacial aspheric diffractive multifocal lenses and their manufacture and use

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5344298A (en) * 1984-08-08 1994-09-06 3D Systems, Inc. Apparatus for making three-dimensional objects by stereolithography
JP2892423B2 (en) * 1990-02-28 1999-05-17 株式会社日立製作所 Image display device and image display method
JPH0454682A (en) * 1990-06-22 1992-02-21 Toshiba Corp Method and device for three-dimensional picture processing
JP3117097B2 (en) * 1992-01-28 2000-12-11 ソニー株式会社 Image conversion device
US5454371A (en) * 1993-11-29 1995-10-03 London Health Association Method and system for constructing and displaying three-dimensional images

Also Published As

Publication number Publication date
WO1996034365A1 (en) 1996-10-31
JPH11504452A (en) 1999-04-20
CN1198230A (en) 1998-11-04
EP0832471A1 (en) 1998-04-01
EP0832471A4 (en) 2000-05-10
AU5667496A (en) 1996-11-18

Similar Documents

Publication Publication Date Title
US5821943A (en) Apparatus and method for recreating and manipulating a 3D object based on a 2D projection thereof
Quan et al. Affine structure from line correspondences with uncalibrated affine cameras
Kumar et al. Registration of video to geo-referenced imagery
US6476803B1 (en) Object modeling system and process employing noise elimination and robust surface extraction techniques
Avidan et al. Novel view synthesis by cascading trilinear tensors
Kumar et al. Representation of scenes from collections of images
US6608923B1 (en) System and method for rectifying images of three dimensional objects
CA2219314A1 (en) Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof
Lee et al. Automatic integration of facade textures into 3D building models with a projective geometry based line clustering
EP1999685B1 (en) Method of and system for storing 3d information
US6606404B1 (en) System and method for computing rectifying homographies for stereo vision processing of three dimensional objects
Fua et al. Reconstructing surfaces from unstructured 3d points
WO1996034365A9 (en) Apparatus and method for recreating and manipulating a 3d object based on a 2d projection thereof
US5974170A (en) Method of detecting relief contours in a pair of stereoscopic images
US6965386B2 (en) Method for three dimensional image reconstruction
Bhayani et al. Partially calibrated semi-generalized pose from hybrid point correspondences
Nicolescu et al. A voting-based computational framework for visual motion analysis and interpretation
Zhao et al. Alignment of continuous video onto 3D point clouds
Demirdjian et al. Stereo autocalibration from one plane
Bartelsen et al. Orientation and dense reconstruction from unordered wide baseline image sets
Gao et al. Enhanced 3D Urban Scene Reconstruction and Point Cloud Densification using Gaussian Splatting and Google Earth Imagery
Quan Uncalibrated 1D projective camera and 3D affine reconstruction of lines
Lu et al. Stereo image matching based on probability relaxation
Le-Tien et al. AUTOMATIC INDEXATION OF CULTURAL HERITAGE 3D OBJECT.
Chabbi et al. A combined use of regions and segments to construct facets

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued
FZDE Discontinued

Effective date: 20061002