JP2006310936A

JP2006310936A - System for generating video image viewed at optional viewpoint

Info

Publication number: JP2006310936A
Application number: JP2005127639A
Authority: JP
Inventors: Kenichiro Yamamoto; 健一郎山本; Masayuki Nakazawa; 正幸中沢; Michiaki Mukai; 理朗向井
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 2005-04-26
Filing date: 2005-04-26
Publication date: 2006-11-09

Abstract

PROBLEM TO BE SOLVED: To provide a technology of tracing a particular object at all times, generating its video image and providing a video image resulting from viewing the object from an easy to see position by automatically and optionally controlling a viewpoint position and a sight line direction.

SOLUTION: Video information items from a multiple-lens imaging section 1 comprising many cameras located at different viewpoint positions and different sight line directions are composed into a video image in an optional viewpoint position and an optional sight line direction. An object position and direction detection section 2 detects the position and the direction of the object and a virtual camera position and direction calculation section 3 calculates the viewpoint position and the sight line direction of a virtual camera with respect to the object. Further, a video generating section 4 uses multiple-lens video data from the multiple-lens imaging section 1 to generate an image according to the information about the viewpoint position and the sight line direction of the virtual camera transmitted from the virtual camera position and direction calculation section 3.

COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an arbitrary viewpoint video generation system. More specifically, it relates to a system that, from video information captured by many cameras arranged at different viewpoint positions and line-of-sight directions, generates video at a viewpoint position and line-of-sight direction where no camera actually exists, that is, video at the viewpoint position and line-of-sight direction of a virtual camera.

Many research institutes and universities are studying arbitrary viewpoint video generation technology, which assumes a virtual camera that does not actually exist and generates video of a subject as seen from any desired viewpoint position and line-of-sight direction. In this technology, a scene (subject) is shot with a large number of real cameras and the resulting video data is processed so that, assuming a virtual camera at a point where no camera actually exists, video is generated as if that virtual camera had shot it. For example, with techniques such as those of Non-Patent Document 1 and Non-Patent Document 2, a user can operate the viewpoint position and line-of-sight direction of the virtual camera and keep tracking a specific subject while watching the video. Taking a sports broadcast as an example, the user can follow a favorite player at all times.
On the other hand, for shooting a moving player with a camera, there is automatic tracking technology that automatically controls the camera direction to track the player (see, for example, Patent Document 1).
Patent Document 1: JP 2004-80163 A
Non-Patent Document 1: Kitahara, Sato, Ota, "3-D image display capable of reproducing motion parallax using the multi-eye stereo method: generation and evaluation of display images", Journal of the Institute of Television Engineers of Japan, Vol. 50, No. 9, pp. 1268-1276 (1996)
Non-Patent Document 2: Purim, Fujii, Tanimoto, "Real-time system for free viewpoint TV", IEICE Technical Report, IE2002-120 (2002)
Non-Patent Document 3: Inamoto, Saito, "Free viewpoint viewing system for 3D soccer video based on viewpoint interpolation", Journal of the Institute of Image Information and Television Engineers, Vol. 58, No. 4, pp. 529-539 (2004)
Non-Patent Document 4: Kimura, Saito, "Method for generating player viewpoint video from multi-viewpoint images of tennis", IEICE General Conference, D-12-172 (2004)
Non-Patent Document 5: Otani, Kishino, "Model-based estimation of human posture from multi-view images using genetic algorithms", Journal of the Institute of Image Information and Television Engineers, Vol. 51, No. 12, pp. 2107-2115 (1997)
Non-Patent Document 6: Minato, Saito, "Analysis of player motion using multi-view soccer video", IEICE General Conference, D-12-123 (2004)

With arbitrary viewpoint video generation technology as described above, it is possible to keep tracking and watching a specific player. However, the user must constantly adjust the viewpoint position and line-of-sight direction of the virtual camera to follow the player's movement, which places a heavy burden on the user. To reduce this burden, a function that tracks the player automatically is desirable.

However, in the automatic tracking technology of Patent Document 1, the camera position is fixed. Depending on the relationship between the camera position and the player's position, the desired video of the player may not be obtained: for example, the player's image becomes small as the player moves away from the camera, or only the player's back can be shot.

To solve the above problems, the present invention achieves the following two points when generating arbitrary viewpoint video.
(1) A specific player is tracked at all times, and video at the viewpoint position and line-of-sight direction of the virtual camera is generated.
(2) The viewpoint position and line-of-sight direction of the virtual camera are controlled automatically according to the position and orientation of the player, and video of the player seen from an easy-to-see position is provided.
For example, the virtual camera is automatically moved to a position and direction that always keep it directly in front of the player at a fixed distance, and video at the viewpoint position and line-of-sight direction of that virtual camera is generated.

A first means of the present invention is a video system that synthesizes or selects video at the viewpoint position and line-of-sight direction of an arbitrary virtual camera from video of a subject shot from one or more viewpoint positions and line-of-sight directions, characterized by comprising calculation means for calculating a viewpoint position and line-of-sight direction of the virtual camera that satisfy a predetermined condition.

A second means of the present invention is characterized in that the first means further comprises detection means for detecting the position and orientation of one or more subjects, and the viewpoint position and line-of-sight direction of the virtual camera satisfying the predetermined condition are calculated using the position and orientation information of the one or more subjects obtained by the detection means.

A third means of the present invention is characterized in that, in the second means, the predetermined condition is that the virtual camera is located in a specific direction relative to the subject and the line-of-sight direction of the virtual camera points toward the subject.

A fourth means of the present invention is characterized in that, in the second means, the predetermined condition is that the line-of-sight direction of the virtual camera is fixed while its viewpoint position is changed so that the subject can always be shot from the viewpoint position and line-of-sight direction of the virtual camera.

A fifth means of the present invention is characterized in that, in the second to fourth means, the predetermined condition is that a plurality of subjects fit within the video at the viewpoint position and line-of-sight direction of the virtual camera.

A sixth means of the present invention is characterized in that, in the second means, the predetermined condition is that the viewpoint position and line-of-sight direction of the virtual camera coincide with the viewpoint position and line-of-sight direction of the subject.

A seventh means of the present invention is characterized in that, in the second to sixth means, processing with a dead band is applied to the information on the position and direction of the subject.

An eighth means of the present invention is characterized in that, in the second to seventh means, low-pass filter processing is applied to the information on the position and direction of the subject.

A ninth means of the present invention is characterized in that, in the second to eighth means, the detection means for detecting the position and orientation of one or more subjects performs the detection using multi-view video of the one or more subjects shot from one or more viewpoint positions and line-of-sight directions.

A tenth means of the present invention is characterized in that, in the second to eighth means, the detection means for detecting the position and orientation of the one or more subjects performs the detection using video of the one or more subjects shot from above.

An eleventh means of the present invention is characterized in that, in the second to eighth means, the subject itself carries a sensor capable of detecting the subject's position and orientation, and the detection means for detecting the position and orientation of the one or more subjects performs the detection using the position and orientation information acquired by the sensor.

A twelfth means of the present invention is characterized in that, in the second to eighth means, the system has detection means that combines any of the detection means of claims 9 to 11.

A thirteenth means of the present invention is characterized in that, in the first to twelfth means, an arbitrary viewpoint video generation technique is used that, using the calculation result of the calculation means, synthesizes video at the viewpoint position and line-of-sight direction of the virtual camera from the video of the subject shot from one or more viewpoint positions and line-of-sight directions.

A fourteenth means of the present invention is characterized in that, in the first to twelfth means, using the calculation result of the calculation means, the video most similar to the video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera is selected from the video of the subject shot from one or more viewpoint positions and line-of-sight directions, and the selected video is used as the video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera.

A fifteenth means of the present invention is characterized in that, in the first to fourteenth means, the video of a subject shot from one or more viewpoint positions and line-of-sight directions is delayed by a predetermined time, and either video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera is synthesized from the delayed video, or the video most similar to the video at that viewpoint position and line-of-sight direction is selected from the delayed video and used as the video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera.

A sixteenth means of the present invention is characterized in that, in the fifteenth means, the video of a subject shot from one or more viewpoint positions and line-of-sight directions is delayed by a predetermined time so that the calculation of the viewpoint position and line-of-sight direction of the virtual camera runs ahead of the video in relative terms.

A seventeenth means of the present invention is characterized in that the first to sixteenth means further comprise means for outputting the video generated at the viewpoint position and line-of-sight direction of the virtual camera.

An eighteenth means of the present invention is characterized in that, in the seventeenth means, a two-dimensional display is used as the means for outputting the generated video.

A nineteenth means of the present invention is characterized in that, in the seventeenth means, a stereoscopic display is used as the means for outputting the generated video.

A twentieth means of the present invention is characterized in that the first to nineteenth means further comprise means for detecting a condition under which the subject in the generated video is backlit, and the predetermined condition is that the subject in the generated video is not backlit.

A twenty-first means of the present invention is characterized in that, in the twentieth means, backlight is detected using video of a subject shot from one or more viewpoint positions and line-of-sight directions.

According to the video system of the present invention, even when the subject moves, the viewpoint position and line-of-sight direction of the virtual camera are changed automatically to follow the movement, so the subject can always be observed from a fixed, easy-to-see relative position.
When there are a plurality of subjects, the viewpoint position and line-of-sight direction of the virtual camera can be changed automatically so that the subjects always stay within the screen.
By automatically making the viewpoint position and line-of-sight direction of the virtual camera coincide with the position and orientation of the subject, the scene that the subject is looking at can be synthesized.
By applying dead band processing to the subject position and orientation information, fine changes at or below a certain level α can be ignored even when that information fluctuates finely. This prevents the position and/or direction of the virtual camera from changing too finely and allows a stable image to be generated, which in turn prevents the user from suffering motion sickness from the video.
Low-pass filter processing can also be applied to the subject position and orientation information. This processing removes the high-frequency components of the subject position and direction information, prevents the position of the virtual camera from changing too finely, and allows a stable image to be generated, which likewise prevents the user from suffering motion sickness from the video.
By delaying the video, the subject position and orientation information can be processed ahead of the video in relative terms, so the position and direction of the virtual camera can be controlled in anticipation of the subject's future movement.
By determining whether the generated picture is backlit, the viewpoint position and line-of-sight direction of the virtual camera can be controlled so as to avoid viewpoint positions and line-of-sight directions that would result in backlight.
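The dead band and low-pass processing described above could be sketched roughly as follows. This is only an illustration under assumptions not stated in the source: subject positions are 2-D tuples, the dead band compares each new measurement against the last accepted value, and the low-pass filter is a first-order exponential smoother. The function names and the `gain` parameter are hypothetical. Orientation angles would additionally need wrap-around handling, which is omitted here.

```python
import math

def apply_dead_band(previous, measured, alpha):
    """Ignore the new measurement if it moved by alpha or less (assumed rule)."""
    if math.dist(previous, measured) <= alpha:
        return previous
    return measured

def low_pass(previous, measured, gain):
    """First-order low-pass (exponential smoothing); gain is assumed in (0, 1]."""
    return tuple(p + gain * (m - p) for p, m in zip(previous, measured))

def stabilize(track, alpha, gain):
    """Apply the dead band and then the low-pass filter to a list of (x, y) positions."""
    held = track[0]        # last value accepted by the dead band
    smoothed = track[0]    # current low-pass output
    out = [smoothed]
    for measured in track[1:]:
        held = apply_dead_band(held, measured, alpha)
        smoothed = low_pass(smoothed, held, gain)
        out.append(smoothed)
    return out
```

The stabilized track, rather than the raw detections, would then be passed on to the calculation of the virtual camera's viewpoint position and line-of-sight direction.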

First embodiment:
FIG. 1 is a block diagram showing the schematic configuration of the main part of a first embodiment of the arbitrary viewpoint video generation system according to the present invention. In the figure, reference numeral 10 denotes the arbitrary viewpoint video generation system, which comprises a multi-view video shooting unit 1, a subject position and orientation detection unit 2, a virtual camera position and direction calculation unit 3, a video generation unit 4, and a video output unit 5.

The multi-view video shooting unit 1 has a large number of cameras arranged around the subject and obtains a group of video data in which the subject is shot from various viewpoint positions and line-of-sight directions.

The subject position and orientation detection unit 2 detects the position and orientation of one or more subjects and sends the detected information to the virtual camera position and direction calculation unit 3.

The means for detecting the position and orientation of the subject are described below.
Regarding subject position, Non-Patent Documents 3 and 4, for example, describe methods for obtaining the positions of a plurality of subjects from multi-view video data. Non-Patent Documents 5 and 6, for example, describe methods for estimating the orientation of a subject from multi-view video data. With these methods, the positions and orientations of a plurality of subjects can be detected from the multi-view video data obtained from the multi-view video shooting unit 1.
As another method, the subject can be made to carry, in advance, a sensor that can detect its position and orientation, such as a GPS (Global Positioning System) receiver or a gyro sensor, and the information acquired from the sensor can be used to detect the position and orientation of the subject.
Of course, other means may be used, and these means may be used in combination.

The virtual camera position and direction calculation unit 3 uses the subject position and orientation information obtained from the subject position and orientation detection unit 2 to calculate the viewpoint position and line-of-sight direction of the virtual camera relative to the subject. This calculation method is described in more detail later.

The video generation unit 4 uses the multi-view video data from the multi-view video shooting unit 1 and, following the viewpoint position and line-of-sight direction information of the virtual camera sent from the virtual camera position and direction calculation unit 3, generates a virtual image of the subject as seen from the virtual camera. As methods of generating this virtual image, the following two methods A and B, for example, are conceivable.

A. Video at the viewpoint position and line-of-sight direction of the virtual camera is synthesized from the multi-view video data using the arbitrary viewpoint video generation techniques described in Non-Patent Document 1 and Non-Patent Document 2. In this case, video at viewpoint positions and line-of-sight directions where no camera originally exists can also be synthesized and displayed, and the video changes smoothly as the viewpoint position and line-of-sight direction of the virtual camera change. In addition, by synthesizing video at two viewpoints separated by the distance between the left and right eyes, arbitrary viewpoint video generation can also produce video suitable for stereoscopic display.

B. The video closest to the viewpoint position and line-of-sight direction of the virtual camera is selected from the multi-view video data and displayed. This is a simple method whose advantage is that no complicated image synthesis is required. In exchange, only video from locations where a camera actually exists can be viewed, and as the position and direction of the virtual camera change, the video switches from time to time to the camera closest to that position and direction.
Of course, other methods may be used.
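Method B could be sketched, for illustration only, as a nearest-pose search. The source does not define what "closest" means, so the cost below (position distance plus a weighted angle between line-of-sight directions) and the names `pose_distance`, `select_nearest_camera` and `angle_weight` are assumptions. Each camera is represented as a (position, direction) pair of coordinate tuples.

```python
import math

def pose_distance(cam_a, cam_b, angle_weight=1.0):
    """Assumed cost: position distance plus weighted angle between view directions."""
    (pos_a, dir_a), (pos_b, dir_b) = cam_a, cam_b
    position_term = math.dist(pos_a, pos_b)
    dot = sum(a * b for a, b in zip(dir_a, dir_b))
    norm = math.hypot(*dir_a) * math.hypot(*dir_b)
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    angle_term = math.acos(max(-1.0, min(1.0, dot / norm)))
    return position_term + angle_weight * angle_term

def select_nearest_camera(virtual_cam, real_cams, angle_weight=1.0):
    """Return the index of the real camera whose pose is closest to the virtual camera."""
    return min(
        range(len(real_cams)),
        key=lambda i: pose_distance(virtual_cam, real_cams[i], angle_weight),
    )
```

The video from the selected index would then be output as is, which is why the picture switches between real cameras as the virtual camera moves.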

The video output unit 5 displays the video generated by the video generation unit 4. It may be a general two-dimensional display or a three-dimensional display capable of stereoscopic viewing.

Several processing methods are conceivable for the virtual camera position and direction calculation unit 3, depending on the subject. Each is described in detail below.

FIG. 2 is a diagram for explaining a first example of calculating the viewpoint position and line-of-sight direction of the virtual camera when the present invention is applied to a sports broadcast. In the figure, 21 is a soccer player assumed as the subject, 22 is the virtual camera, arrow A is the front direction of the soccer player 21, arrow B is the moving direction of the soccer player 21, and arrow C is the line-of-sight direction of the virtual camera 22. FIG. 2 is a bird's-eye view of the soccer player 21 moving successively in the direction of arrow B while changing body orientation, that is, the front direction (arrow A), in order to look around.

Here, by using the position and orientation information of the specific subject obtained from the subject position and orientation detection unit 2, the virtual camera 22 can be controlled so that it is always located at a fixed distance and direction relative to the subject and always points toward the subject (arrow C).

FIG. 2 shows the virtual camera 22 being controlled so that, even when the player 21 changes position and direction, it is always located at a fixed distance in the front direction (arrow A) of the player 21 and points toward the subject (arrow C). With this control, video of the player 21 seen from the front at a fixed distance can always be generated regardless of the position and direction of the player 21. Of course, the same processing is possible not only for the front of the subject but also for any direction, such as the back, side, or top of the subject. Controlling the virtual camera position in this way makes it possible to always observe the subject from a fixed, easy-to-see relative position.
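The first example can be illustrated with a short sketch. It assumes a 2-D ground-plane model in which the subject's position is an (x, y) pair and its front direction (arrow A) is an angle in radians; the function name `front_camera_pose` and the `offset_rad` parameter (0 for the front, π for the back, ±π/2 for the sides) are hypothetical and not taken from the source.

```python
import math

def front_camera_pose(subject_xy, facing_rad, distance, offset_rad=0.0):
    """Place the virtual camera at a fixed distance and relative direction from
    the subject, with its line of sight pointing at the subject."""
    sx, sy = subject_xy
    bearing = facing_rad + offset_rad
    cam_x = sx + distance * math.cos(bearing)
    cam_y = sy + distance * math.sin(bearing)
    # The line of sight (arrow C) points from the camera toward the subject.
    view_rad = math.atan2(sy - cam_y, sx - cam_x)
    return (cam_x, cam_y), view_rad
```

Calling this function for every detected position and orientation keeps the camera in the same relative position as the player turns and moves.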

FIG. 3 is a diagram for explaining a second example of calculating the viewpoint position of the virtual camera when the present invention is applied to a sports broadcast. In the figure, 21 is a sprinter assumed as the subject, 22 is the virtual camera, arrow A is the front direction of the sprinter 21, arrow B is the moving direction of the sprinter 21, and arrow C is the line-of-sight direction of the virtual camera 22. FIG. 3 shows the sprinter 21 running while looking around to check on the other runners.

In this case, by using the position information of the specific subject obtained from the subject position and orientation detection unit 2, the viewpoint position of the virtual camera 22 can be controlled so that the camera keeps shooting the subject at all times while its absolute direction remains fixed. In the second example shown in FIG. 3, the viewpoint position of the virtual camera 22 is moved so that the player 21 is captured at the center of the screen while the line-of-sight direction (arrow C) of the virtual camera 22 is always kept perpendicular to the course (the moving direction B of the sprinter 21). Controlling the viewpoint position of the virtual camera 22 in this way makes it possible, for example in a 100 m race, to always view the runner from directly alongside, and to generate video that clearly shows which runner is currently in the lead. Of course, the same processing can be performed with the orientation of the virtual camera fixed in any direction, not only perpendicular to the course.
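As a rough sketch of the second example, the camera can be translated along with the subject while its line-of-sight direction stays constant. The 2-D model, the function name `side_tracking_camera`, and the fixed stand-off `distance` are assumptions made for illustration.

```python
import math

def side_tracking_camera(subject_xy, view_dir, distance):
    """Keep the line-of-sight direction fixed and move the viewpoint so that the
    subject stays on the optical axis (center of the picture)."""
    norm = math.hypot(*view_dir)
    ux, uy = view_dir[0] / norm, view_dir[1] / norm
    sx, sy = subject_xy
    # The camera sits `distance` behind the subject along the fixed line of sight.
    return (sx - distance * ux, sy - distance * uy), (ux, uy)
```

For a course running along the x axis, passing `view_dir=(0, 1)` keeps the camera directly alongside the runner at every position.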

FIGS. 4 and 5 are diagrams for explaining a third example of calculating the position of the virtual camera when the present invention is applied to a sports broadcast. In FIGS. 4 and 5, 21 is a soccer player assumed as the subject, 22 is the virtual camera, 23 is the ball, arrow A is the front direction of the soccer player 21, and arrow D indicates the shooting range of the virtual camera 22. FIG. 4 shows a situation in which the ball 23 is near the player 21, and FIG. 5 shows a situation in which the ball 23 is moving away from the player 21 because the player 21 has kicked it. In this case, by using the position information of the specific subjects obtained from the subject position and orientation detection unit 2, the position of the virtual camera 22 can be controlled so that both the player 21 and the ball 23 fall within the shooting range in both FIG. 4 and FIG. 5.

In FIG. 5, both the player 21 and the ball 23 are kept within the shooting range by moving the virtual camera 22 backward from its position in FIG. 4. The third example can also be used in combination with the first and second examples of calculating the position and direction of the virtual camera.
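One way to picture the third example is to pull the camera back along its line of sight until every subject lies inside the horizontal field of view. The sketch below assumes a 2-D model, a camera aimed at the centroid of the subjects, and a field of view `fov_rad` smaller than π; none of these choices, nor the name `fit_subjects`, come from the source.

```python
import math

def fit_subjects(points, view_dir, fov_rad, min_distance=0.0):
    """Return (camera position, stand-off distance) so that all points lie within
    the horizontal field of view of a camera looking along view_dir."""
    norm = math.hypot(*view_dir)
    ux, uy = view_dir[0] / norm, view_dir[1] / norm
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    tan_half = math.tan(fov_rad / 2)
    distance = min_distance
    for px, py in points:
        dx, dy = px - cx, py - cy
        along = dx * ux + dy * uy          # offset along the line of sight
        lateral = abs(-dx * uy + dy * ux)  # offset perpendicular to it
        # Visibility needs lateral <= (distance + along) * tan(fov / 2).
        distance = max(distance, lateral / tan_half - along)
    return (cx - distance * ux, cy - distance * uy), distance
```

As the ball moves away from the player, the lateral spread grows and the returned distance increases, which corresponds to the camera moving backward between FIG. 4 and FIG. 5.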

FIG. 6 is a diagram for explaining a fourth example of calculating the viewpoint position and line-of-sight direction of the virtual camera when the present invention is applied to a sports broadcast. In FIG. 6, 21 is a soccer player assumed as the subject, 22 is the virtual camera, arrow B indicates the movement direction of the soccer player 21, and arrow E indicates both the front direction of the soccer player 21 and the line-of-sight direction of the virtual camera 22. FIG. 6 shows the soccer player 21 moving in the direction of arrow B while changing the orientation of the body, that is, the front direction (arrow E), in order to look around.

Here, using the position and orientation information of the specific subject obtained from the subject position and orientation detection unit 2, the position and orientation of the virtual camera 22 can be matched to the viewpoint position and orientation of the player 21. This makes it possible to synthesize the video seen from the viewpoint position of the player 21 in the direction the player 21 is facing, that is, the scene the player is looking at. FIG. 6 shows the position and direction of the virtual camera 22 following the viewpoint position and line-of-sight direction of the player 21.

Second embodiment:
FIG. 7 is a block diagram showing the schematic configuration of the main part for explaining a second embodiment of the arbitrary viewpoint video generation system according to the present invention. In the figure, 10 is the arbitrary viewpoint video generation system according to the present invention, which comprises a multi-view video shooting unit 1, a subject position and orientation detection unit 2, a virtual camera position and direction calculation unit 3, a video generation unit 4, a video output unit 5, and an information recording unit 6. In other words, the information recording unit 6 is added to the first embodiment shown in FIG. 1. The information recording unit 6 records the dead band information needed when processing the position and orientation information of the specific subject obtained from the subject position and orientation detection unit 2, as well as past information on the position and orientation of the specific subject.

FIG. 8 is a flowchart illustrating an example of calculating the subject direction information with a dead band applied to the subject orientation information (subject direction information).
First, the subject orientation information (angle F) is acquired from the subject position and orientation detection unit 2 (step S1).
Next, it is determined whether the angle F is within the dead band range recorded in the information recording unit 6 (step S2).
If it is within the dead band range (YES in step S2), the subject orientation information from the previous point in time is read from the information recording unit 6 and output as data for calculating the viewpoint position and line-of-sight direction of the virtual camera (step S3).
If it is outside the dead band range (NO in step S2), the dead band recorded in the information recording unit 6 is first overwritten with angle F - α as the new lower limit and angle F + α as the new upper limit (step S4), the angle F is recorded in the information recording unit 6 as the new value of the subject orientation information (step S5), and the angle F is output as data for calculating the viewpoint position and line-of-sight direction of the virtual camera (step S6).
Finally, the viewpoint position and line-of-sight direction of the virtual camera are calculated (step S7).
The above processing is performed for every frame of the video. Here, α is the dead band level (reference value).
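Steps S1 to S6 of FIG. 8 can be sketched as follows. This is a minimal illustration rather than the implementation of the specification: the class name `DeadBandFilter` and its attributes are hypothetical, the very first frame is treated as being outside the dead band, and wrap-around of angles at 360 degrees is deliberately not handled.

```python
class DeadBandFilter:
    """Holds the state that the information recording unit 6 would store:
    the dead band limits and the last orientation that was output."""

    def __init__(self, alpha):
        self.alpha = alpha   # dead band level (reference value)
        self.lower = None    # lower limit of the current dead band
        self.upper = None    # upper limit of the current dead band
        self.held = None     # orientation recorded at the previous update

    def update(self, angle_f):
        """Process one frame and return the orientation to use for the
        virtual camera calculation."""
        # Step S2: is the new angle inside the recorded dead band?
        if self.held is not None and self.lower <= angle_f <= self.upper:
            # Step S3: reuse the previously recorded orientation.
            return self.held
        # Step S4: re-center the dead band on the new angle.
        self.lower = angle_f - self.alpha
        self.upper = angle_f + self.alpha
        # Step S5: record the new orientation.
        self.held = angle_f
        # Step S6: output the new orientation.
        return angle_f


band = DeadBandFilter(alpha=5.0)
for angle in (10.0, 12.0, 14.0, 16.0, 18.0, 9.0):
    print(angle, "->", band.update(angle))
```

The returned value would then feed the calculation of the viewpoint position and line-of-sight direction (step S7). Whether the band limits themselves count as inside the dead band is not stated in the source; the sketch treats them as inside.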

Without the dead band processing, the position and direction of the virtual camera constantly make small changes to follow changes in the orientation of the subject, and a user watching the generated video may suffer motion sickness.
In contrast, with the dead band processing described above, even when the subject orientation information (angle F) fluctuates finely, changes no larger than the level α can be ignored. This prevents the position and/or direction of the virtual camera from changing too finely and makes it possible to generate a stable video. As a result, the user can be kept from suffering motion sickness.

Although FIG. 8 was described for the case where the processing is applied to the subject orientation information, the same processing can of course also be applied to the subject position information.

It is also possible to record past subject position and direction information in the information recording unit 6 and to apply low-pass filter processing to the subject position and direction information arranged in time series. This processing removes the high-frequency components of the subject position and direction information, prevents the position of the virtual camera from changing too finely, and makes it possible to generate a stable video. As a result, the user can be kept from suffering motion sickness.
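The specification does not name a particular low-pass filter. As one possible illustration, the sketch below uses a simple moving average over the most recent samples, which is a basic low-pass filter; the class name `MovingAverageFilter` and the `window` parameter are assumptions, and angle wrap-around is not handled.

```python
from collections import deque


class MovingAverageFilter:
    """Smooths a time series by averaging the most recent samples.

    The bounded history plays the role of the past information kept in
    the information recording unit 6.
    """

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        # Oldest samples are discarded automatically once the deque is full.
        self.history = deque(maxlen=window)

    def update(self, value):
        """Record one new sample and return the smoothed value."""
        self.history.append(value)
        return sum(self.history) / len(self.history)


smooth = MovingAverageFilter(window=3)
for x in (0.0, 3.0, 6.0, 9.0):
    print(x, "->", smooth.update(x))
```

A larger window removes more of the high-frequency jitter but makes the virtual camera respond more slowly to genuine movement of the subject.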

Third embodiment:
FIG. 9 is a block diagram showing the schematic configuration of the main part for explaining a third embodiment of the arbitrary viewpoint video generation system according to the present invention. In the figure, 10 is the arbitrary viewpoint video generation system according to the present invention, which comprises a multi-view video shooting unit 1, a subject position and orientation detection unit 2, a virtual camera position and direction calculation unit 3, a video generation unit 4, a video output unit 5, an information recording unit 6, and a video delay unit 7. In other words, the video delay unit 7 is added to the second embodiment shown in FIG. 7.

The video delay unit 7 temporarily stores the multi-view video data from the multi-view video shooting unit 1 and sends it to the video generation unit 4 after a delay. Because the video is delayed in this way, the subject position and orientation information obtained from the subject position and orientation detection unit 2 is, relative to the video, information from an earlier point in the processing timeline; that is, it is available before the corresponding video is processed. This makes it possible to perform processing equivalent to reading the subject position and orientation information ahead of time (hereinafter called virtual look-ahead processing).
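The effect of the video delay unit 7 can be sketched with a simple first-in, first-out buffer. This is an illustration under assumptions, not the implementation of the specification: the class name `VideoDelay`, the delay expressed as a number of frames, and the use of plain strings as stand-ins for multi-view frames are all hypothetical.

```python
from collections import deque


class VideoDelay:
    """Buffers frames and releases each one a fixed number of frames later."""

    def __init__(self, delay_frames):
        if delay_frames < 0:
            raise ValueError("delay_frames must be non-negative")
        self.delay_frames = delay_frames
        self.buffer = deque()

    def push(self, frame):
        """Store the newest frame and return the frame captured
        delay_frames earlier, or None while the buffer is still filling."""
        self.buffer.append(frame)
        if len(self.buffer) > self.delay_frames:
            return self.buffer.popleft()
        return None


delay = VideoDelay(delay_frames=2)
for t in range(5):
    # Detection results for time t are already known at this point,
    # while the frame handed to video generation is from time t - 2.
    delayed_frame = delay.push(f"frame{t}")
    print("detected up to t =", t, "| generating from", delayed_frame)
```

When the frame for time t - 2 is rendered, the subject positions for times t - 1 and t are already available, which is what allows the virtual camera to start moving before the event appears in the output video.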

A specific example of the third embodiment will be described with reference to FIGS. 10 and 11. In FIGS. 10 and 11, 21 is a soccer player assumed as the subject, 22 is the virtual camera, 23 is the soccer ball, arrow A indicates the front direction of the soccer player 21, and arrow D indicates the shooting range of the virtual camera 22. FIG. 10 shows a situation in which the ball is near the player 21, and FIG. 11 shows a situation in which the ball 23 is moving away from the player because the player 21 has kicked it.

For the first embodiment described with reference to FIGS. 4 and 5, it was shown that the position of the virtual camera can be controlled so that both the player 21 and the ball 23 are on the screen in both FIG. 4 and FIG. 5. When virtual look-ahead processing is added to the first embodiment, it is known in advance that the player 21 will kick the ball 23 and that the ball 23 will move away. The viewpoint position and line-of-sight direction of the virtual camera can therefore be moved early and controlled so that the area in the traveling direction of the ball 23 falls within the screen (within the shooting range indicated by arrow D).

Comparing the position of the virtual camera 22 in FIG. 10 with its position in FIG. 4, in FIG. 10 the position and orientation of the virtual camera 22 are controlled in anticipation of the player 21 kicking the ball 23, so the virtual camera is positioned and oriented to bring the area in the traveling direction of the ball 23 into the screen. Similarly, comparing the position of the virtual camera 22 in FIG. 11 with its position in FIG. 5, the virtual camera 22 in FIG. 11 is positioned and oriented to bring the area in the traveling direction of the ball 23 into the screen.

Performing such virtual look-ahead processing makes it easier for the viewer to follow the ball. It also makes it easier to see what lies ahead in the traveling direction of the ball. In addition, it becomes possible to prevent the ball from leaving the screen because the processing lags behind.

Information other than the subject position and orientation information can also be used to control the viewpoint position and line-of-sight direction of the virtual camera. For example, means for determining whether the generated screen is backlit can be used to control the viewpoint position and line-of-sight direction of the virtual camera so as to avoid viewpoint positions and line-of-sight directions that result in backlighting. Alternatively, a high-luminance subject can be identified from the video of the multi-view video shooting unit 1, and the viewpoint position and line-of-sight direction of the virtual camera can be controlled so that this subject does not enter the generated screen, thereby avoiding viewpoint positions and line-of-sight directions that result in backlighting.
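The specification does not say how backlighting is determined. As one purely geometric illustration of keeping a high-luminance subject out of the generated screen, the sketch below assumes that the position of the bright source is already known in ground-plane coordinates and rejects candidate camera poses whose horizontal angle of view contains it. The function names `light_in_view` and `non_backlit_candidates` are hypothetical.

```python
import math


def light_in_view(camera_xy, view_dir_xy, light_xy, fov_deg):
    """Return True if the light source lies inside the camera's
    horizontal angle of view."""
    to_light = (light_xy[0] - camera_xy[0], light_xy[1] - camera_xy[1])
    dist = math.hypot(*to_light)
    norm = math.hypot(*view_dir_xy)
    if dist == 0 or norm == 0:
        raise ValueError("degenerate camera or light geometry")
    # Angle between the viewing direction and the direction to the light.
    cos_angle = (to_light[0] * view_dir_xy[0]
                 + to_light[1] * view_dir_xy[1]) / (dist * norm)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return math.degrees(math.acos(cos_angle)) < fov_deg / 2.0


def non_backlit_candidates(candidates, light_xy, fov_deg):
    """Keep only the (position, direction) pairs that do not face the light."""
    return [(pos, direction) for pos, direction in candidates
            if not light_in_view(pos, direction, light_xy, fov_deg)]


candidates = [((0.0, 0.0), (0.0, 1.0)),   # looks toward the light
              ((0.0, 0.0), (0.0, -1.0))]  # looks away from the light
print(non_backlit_candidates(candidates, light_xy=(0.0, 100.0), fov_deg=60.0))
```

The remaining candidates could then be ranked by the subject-based conditions described earlier, so that backlight avoidance acts as one more predetermined condition.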

In this way, the requirement that the generated screen not be backlit can also serve as the predetermined condition for calculating the viewpoint position and line-of-sight direction of the virtual camera. Various other predetermined conditions can likewise be used to calculate the viewpoint position and line-of-sight direction of the virtual camera.

It is also possible to combine the processing that controls the viewpoint position and line-of-sight direction of the virtual camera so as to avoid backlit viewpoint positions and line-of-sight directions with the processing that calculates the viewpoint position and line-of-sight direction of the virtual camera from the subject position and direction information.

In this specification, as shown in FIG. 1, FIG. 7, and FIG. 9, the description assumed that the data obtained by the multi-view video shooting unit 1 is input directly to the video generation unit 4 or the video delay unit 7, but embodiments are not limited to this. For example, the same mechanism can be applied to broadcasting, that is, to a system in which the multi-view video data shot by the multi-view video shooting unit 1 is transmitted by radio waves or over a network and the viewer receives and processes it. The same mechanism can also be applied to packaged media, that is, to a system in which the multi-view video data shot by the multi-view video shooting unit 1 is first recorded on some recording medium and is then read out and processed on the viewer side.

FIG. 1 is a block diagram showing the schematic configuration of the main part for explaining a first embodiment of the video system according to the present invention.
FIG. 2 is a diagram for explaining a first example of calculating the position and direction of the virtual camera.
FIG. 3 is a diagram for explaining a second example of calculating the position of the virtual camera.
FIG. 4 is a diagram for explaining a third example of calculating the position of the virtual camera.
FIG. 5 is a diagram for explaining the third example of calculating the position of the virtual camera.
FIG. 6 is a diagram for explaining a fourth example of calculating the position and direction of the virtual camera.
FIG. 7 is a block diagram showing the schematic configuration of the main part for explaining a second embodiment of the video system according to the present invention.
FIG. 8 is a flowchart illustrating processing that applies a dead band to the subject orientation information (subject direction information) and calculates the subject direction information.
FIG. 9 is a block diagram showing the schematic configuration of the main part for explaining a third embodiment of the video system according to the present invention.
FIG. 10 is a diagram for explaining the third embodiment of the present invention.
FIG. 11 is a diagram for explaining the third embodiment of the present invention.

Explanation of symbols

1 ... multi-view video shooting unit, 2 ... subject position and orientation detection unit, 3 ... virtual camera position and direction calculation unit, 4 ... video generation unit, 5 ... video output unit, 6 ... information recording unit, 7 ... video delay unit, 21 ... player, 22 ... virtual camera, 23 ... ball.

Claims

An arbitrary viewpoint video generation system, being a video system that synthesizes or selects a video at the viewpoint position and line-of-sight direction of an arbitrary virtual camera from videos of a subject shot from one or more viewpoint positions and line-of-sight directions, the system comprising calculation means for calculating the viewpoint position and line-of-sight direction of a virtual camera that satisfy a predetermined condition.

The arbitrary viewpoint video generation system according to claim 1, further comprising means for detecting the position and orientation of one or more subjects, wherein the viewpoint position and line-of-sight direction of the virtual camera that satisfy the predetermined condition are calculated using the information on the position and orientation of the one or more subjects obtained by the detection means.

3. The arbitrary viewpoint video generation system according to claim 2, wherein the predetermined condition is that the virtual camera is positioned in a specific direction with respect to the subject and the line-of-sight direction of the virtual camera faces the subject.

3. The arbitrary viewpoint video generation system according to claim 2, wherein the predetermined condition is that the line-of-sight direction of the virtual camera is fixed while its viewpoint position is changed, so that the subject can always be shot from the viewpoint position and line-of-sight direction of the virtual camera.

5. The arbitrary viewpoint video generation system according to claim 2, wherein the predetermined condition is that a plurality of subjects are included in a video in a viewpoint position and a line-of-sight direction of a virtual camera.

3. The arbitrary viewpoint video generation system according to claim 2, wherein the predetermined condition is that a viewpoint position and a line-of-sight direction of a subject coincide with a viewpoint position and a line-of-sight direction of a virtual camera.

The arbitrary viewpoint video generation system according to any one of claims 2 to 6, wherein a process of providing a dead band is performed on information on a position and a direction of a subject.

The arbitrary viewpoint video generation system according to any one of claims 2 to 7, wherein a low-pass filter process is applied to information on a position and a direction of a subject.

The detection means for detecting the position and orientation of one or more subjects is a detection means for detecting one or more subjects using a multi-view image captured from one or more viewpoint positions and line-of-sight directions. The arbitrary viewpoint video generation system according to any one of claims 2 to 8.

The detection means for detecting the position and orientation of the one or more subjects is detection means for detecting using one or more subjects photographed from above. The arbitrary viewpoint video generation system according to 1.

The detection means for detecting the position and orientation of the one or more subjects has a sensor capable of detecting the location and orientation of the subject in the subject itself, and uses the information on the location and orientation of the subject acquired by the sensor. The arbitrary viewpoint video generation system according to any one of claims 2 to 8, wherein the arbitrary viewpoint video generation system is detection means for detecting.

The arbitrary viewpoint video generation system according to any one of claims 2 to 8, further comprising detection means that combines the detection means according to claims 9 to 11.

13. The arbitrary viewpoint video generation system according to claim 1, wherein, using the calculation result of the calculation means, an arbitrary viewpoint video generation technique is used that synthesizes a video at the viewpoint position and line-of-sight direction of the virtual camera from videos of a subject shot from one or more viewpoint positions and line-of-sight directions.

The arbitrary viewpoint video generation system according to any one of claims 1 to 12, wherein, using the calculation result of the calculation means, the video most similar to the video at the viewpoint position and line-of-sight direction of an arbitrary virtual camera is selected from videos of a subject shot from one or more viewpoint positions and line-of-sight directions, and the selected video is used as the video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera.

15. The arbitrary viewpoint video generation system according to claim 1, wherein videos of a subject shot from one or more viewpoint positions and line-of-sight directions are delayed by a predetermined time, and either a video at the viewpoint position and line-of-sight direction of an arbitrary virtual camera is synthesized from the delayed videos, or the video most similar to the video at that viewpoint position and line-of-sight direction is selected from the delayed videos and used as the video at the viewpoint position and line-of-sight direction of the arbitrary virtual camera.

16. The video obtained by photographing a subject from one or more viewpoint positions and line-of-sight directions is delayed for a predetermined time, and the viewpoint position and line-of-sight direction of the virtual camera are calculated relatively earlier than the video. The arbitrary viewpoint video generation system described in 1.

17. The arbitrary viewpoint video generation system according to claim 1, further comprising means for outputting a video generated at a viewpoint position and a line-of-sight direction of the virtual camera.

The arbitrary viewpoint video generation system according to claim 17, wherein a display of a two-dimensional display is used as the means for outputting the generated video.

The arbitrary viewpoint video generation system according to claim 17, wherein a stereoscopic display is used as the means for outputting the generated video.

20. The apparatus according to claim 1, further comprising means for detecting a condition in which the subject in the generated video is backlit, wherein the predetermined condition is that the subject in the generated video is not backlit. Or the arbitrary viewpoint video generation system according to claim 1.

21. The arbitrary viewpoint video generation system according to claim 20, wherein backlight detection is performed using video obtained by photographing a certain subject from one or more viewpoint positions and line-of-sight directions.