
TW543323B - Multiple camera control system - Google Patents

Multiple camera control system

Info

Publication number
TW543323B
Authority
TW
Taiwan
Prior art keywords
target
data set
background
image data
scope
Prior art date
Application number
TW90124363A
Other languages
Chinese (zh)
Inventor
Evan Hildreth
Francis Macdougall
Original Assignee
Jestertek Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jestertek Inc
Application granted
Publication of TW543323B

Landscapes

  • Image Analysis (AREA)

Abstract

A multiple camera tracking system for interfacing with an application program running on a computer is provided. The tracking system includes two or more video cameras arranged to provide different viewpoints of a region of interest and operable to produce a series of video images. A processor is operable to receive the series of video images and detect objects appearing in the region of interest. The processor executes a process to generate a background data set from the video images, generate an image data set for each received video image, compare each image data set to the background data set to produce a difference map for each image data set, detect a relative position of an object of interest within each difference map, produce an absolute position of the object of interest from the relative positions, and map the absolute position to a position indicator associated with the application program.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 60/237,187, filed October 3, 2000, and titled "DUAL CAMERA CONTROL SYSTEM", which is incorporated herein by reference.

TECHNICAL FIELD

This invention relates to an object tracking system, and more particularly to a camera-based object tracking and interface control system.

BACKGROUND

A variety of interfaces to computer systems, and of operating systems used to control computer systems, are currently available. Many of these operating systems use standardized interface functions based on generally accepted graphical user interface (GUI) functions and control techniques. As a result, users who are familiar with one platform or application can control a variety of computer platforms and user applications, because the functions and control techniques carry over from one GUI to another.

One widely accepted control technique uses a mouse or trackball-style pointing device to move a cursor over screen objects. An action, such as clicking once or twice on an object, executes a GUI function. For operators unfamiliar with a computer mouse, however, selecting GUI functions in this way can be a significant challenge and a barrier to interfacing with the computer system. In some situations a mouse or trackball cannot be used at all, for example in front of a department store display window on a city street, or when giving a presentation in front of an audience at a large display screen.

SUMMARY

In one general aspect, a method of tracking an object of interest is disclosed. The method includes acquiring a first image and a second image that represent different viewpoints of the object, processing the first image into a first image data set and the second image into a second image data set, and processing the first and second image data sets to generate a background data set associated with the background. The method further includes determining a first difference map from the differences between the first image data set and the background data set, and a second difference map from the differences between the second image data set and the background data set. The method also includes detecting a first relative position of the object in the first difference map and a second relative position of the object in the second difference map, and producing an absolute position of the object from the first and second relative positions.

Processing the first image into the first image data set and the second image into the second image data set may include determining an active image region for each of the first and second images, and extracting an active image data set from the portions of the first and second images contained within the active image region. Extracting the active image data set may include one or more of cropping the first and second images, rotating the first and second images, or shearing the first and second images.

In one implementation, extracting the active image data set may include arranging the active image data set as an image pixel array having rows and columns. Extracting may further include identifying the highest pixel value within each column of the image pixel array, and generating a data set having a single row in which the highest pixel value identified in each column represents that column.

Processing the first image into the first image data set and the second image into the second image data set may also include filtering the first and second images.
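The column-wise reduction described above (keeping the highest pixel value in each column so that the array collapses to a single row) can be sketched in a few lines of Python. The function name and the toy grid below are illustrative, not taken from the patent:

```python
def reduce_to_row(pixels):
    """Collapse a rows-by-columns grid of pixel values into a single row,
    keeping the highest value observed in each column."""
    n_cols = len(pixels[0])
    return [max(row[c] for row in pixels) for c in range(n_cols)]


grid = [
    [10, 52, 11],
    [12, 40, 90],
    [11, 41, 12],
]
print(reduce_to_row(grid))  # -> [12, 52, 90]
```

A reduction like this trades spatial detail for speed: later stages only need to scan one row per camera rather than the full array.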

The filtering may include extracting the edges of the first and second images. The filtering may further include processing the first and second image data sets to emphasize the differences between the first image data set and the background data set, and between the second image data set and the background data set.

Processing the first and second image data sets to generate the background data set may include generating a first set of one or more background data sets associated with the first image data set, and generating a second set of one or more background data sets associated with the second image data set.

Generating the first set of one or more background data sets may include generating a first background set representing the maximum values of the data representing the background within the first image data set, and generating the second set of one or more background data sets may include generating a second background set representing the maximum values of the data representing the background within the second image data set. Generating may further include increasing the values within the first and second background sets, which represent the maximum values of the background data, by a predetermined amount.

Generating the first set of one or more background data sets may also include generating a first background set representing the minimum values of the data representing the background within the first image data set, and generating the second set may include generating a second background set representing the minimum values of the data representing the background within the second image data set. Generating may further include decreasing the values within the first and second background sets, which represent the minimum values of the background data, by a predetermined amount.

Generating the first set of background data sets may include generating samples of the first image data set, and generating the second set of background data sets may include generating samples of the second image data set.
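A minimal sketch of the maximum/minimum background model described above, assuming one-dimensional gray-scale data and a fixed tolerance standing in for the patent's unspecified predetermined amount; all names are illustrative:

```python
def build_background_model(samples, tolerance=8):
    """Derive per-element background bounds from several object-free
    samples: the maximum observed value raised by a tolerance, and the
    minimum observed value lowered by the same tolerance."""
    n = len(samples[0])
    hi = [max(s[i] for s in samples) + tolerance for i in range(n)]
    lo = [min(s[i] for s in samples) - tolerance for i in range(n)]
    return lo, hi


def difference_map(frame, lo, hi):
    """Flag each element (True) that falls outside the modeled range,
    i.e. is inconsistent with the background."""
    return [not (l <= v <= h) for v, l, h in zip(frame, lo, hi)]


samples = [[100, 50, 30], [104, 48, 31], [98, 52, 29]]
lo, hi = build_background_model(samples, tolerance=5)
print(difference_map([101, 90, 30], lo, hi))  # -> [False, True, False]
```

Widening the bounds by a tolerance absorbs sensor noise and small lighting changes, so only a genuinely occluding object flags as inconsistent.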


The sampling may occur automatically at predetermined time intervals, and each sampling may include data that does not represent the background.

Generating the first set of one or more background data sets may include maintaining multiple samplings of the first image data set within each background data set, and generating the second set of one or more background data sets may include maintaining multiple samplings of the second image data set within each background data set. Generating each first background data set may include selecting, from the multiple samplings, a single value representing the background for each element within the first image data set, and generating each second background data set may include selecting, from the multiple samplings, a single value representing the background for each element within the second image data set.

In other implementations, generating may include comparing the first image data set to a subset of the background data set, and comparing the second image data set to a subset of the background data set.

In other implementations, generating the first difference map may further include representing each element in the first image data set as one of two states, and generating the second difference map may further include representing each element in the second image data set as one of two states, the two states indicating whether or not the value is consistent with the background.

In other implementations, detecting may include identifying clusters within the first and second difference maps, each cluster being formed of elements whose state within the associated difference map indicates that they are inconsistent with the background.

Identifying the clusters may further include reducing the difference map to a single row by counting, within each column, the elements that are inconsistent with the background. Identifying the clusters may further include identifying the columns that belong to a cluster and grouping adjacent columns into the cluster. Identifying the columns that belong to a cluster may also include identifying the median column.

Identifying the clusters may further include identifying a position associated with each cluster, which may include calculating the weighted mean of the elements within the cluster.

Detecting may further include classifying a cluster as the object of interest. Classifying a cluster may include counting the elements within the cluster and classifying the cluster as the object only if the count exceeds a predetermined threshold. Classifying a cluster may also include counting the elements within the cluster, counting the total number of elements classified as inconsistent with the background within the difference map, and classifying the cluster as the object only if the ratio of the number of elements in the cluster to the total number of elements exceeds a predetermined threshold.

The detecting step may further include identifying a sub-cluster within the cluster that represents the pointing end of the object, and identifying the position of the object from it.

In the implementations described above, the object may be a user's hand, and the method may include using the absolute position of the object to control an application.

The implementations described above may further include acquiring a third image and a fourth image that represent different viewpoints of the object, processing the third image into a third image data set and the fourth image into a fourth image data set, and processing the third and fourth image data sets to generate the background data set associated with the background. The method may also include determining a third difference map from the differences between the third image data set and the background data set, and a fourth difference map from the differences between the fourth image data set and the background data set, and detecting a third relative position of the object in the third difference map and a fourth relative position of the object in the fourth difference map. The absolute position of the object may then be produced from the first, second, third, and fourth relative positions of the object.
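The cluster steps above (group adjacent inconsistent columns, discard clusters whose element count is below a threshold, and report a per-cluster position) can be sketched for a one-row difference map. A plain mean stands in for the weighted mean, and the names are illustrative:

```python
def find_clusters(diff_row, min_cells=2):
    """Group adjacent flagged columns into clusters, keep clusters whose
    element count reaches `min_cells`, and return each surviving
    cluster's mean column position."""
    clusters, run = [], []
    for col, flagged in enumerate(diff_row):
        if flagged:
            run.append(col)
        elif run:
            clusters.append(run)
            run = []
    if run:
        clusters.append(run)
    return [sum(c) / len(c) for c in clusters if len(c) >= min_cells]


row = [False, True, True, True, False, True, False]
print(find_clusters(row))  # -> [2.0]
```

The isolated flagged column at index 5 is discarded by the count threshold, which is the role the patent assigns to the threshold test: rejecting noise that survives background comparison.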

In some of these implementations, the object may be a user's hand, and the implementations may include using the absolute position of the object to control an application.

In another aspect, a method of tracking an object of interest to control a user's interaction with an application is disclosed. The method includes acquiring images from at least two viewpoints, processing the acquired images to produce an image data set for each acquired image, and comparing each image data set to one or more background data sets to produce a difference map for each acquired image. The method also includes detecting a relative position of the object in each difference map, producing an absolute position of the object from the relative positions, and using the absolute position to allow the user to interact with a computer application.
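Producing an absolute position from two camera-relative positions amounts to intersecting two bearing rays. The sketch below assumes a flat (two-dimensional) region of interest and known camera positions; it is a generic triangulation, not the patent's Equations 4 through 6:

```python
import math


def triangulate(cam_a, angle_a, cam_b, angle_b):
    """Intersect two rays cast from the camera positions at the given
    bearing angles (radians, measured from the +x axis) to recover the
    target's absolute position in the plane."""
    ax, ay = cam_a
    bx, by = cam_b
    dxa, dya = math.cos(angle_a), math.sin(angle_a)
    dxb, dyb = math.cos(angle_b), math.sin(angle_b)
    # Solve ax + t*dxa = bx + s*dxb and ay + t*dya = by + s*dyb for t.
    denom = dxa * dyb - dya * dxb
    t = ((bx - ax) * dyb - (by - ay) * dxb) / denom
    return ax + t * dxa, ay + t * dya


# Cameras at two corners, both sighting the point (1, 1):
x, y = triangulate((0, 0), math.pi / 4, (2, 0), 3 * math.pi / 4)
print(round(x, 6), round(y, 6))  # -> 1.0 1.0
```

A real implementation would first convert each camera's image-plane coordinate into a bearing angle using the camera's field of view, and would guard against near-parallel rays (denom close to zero).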

Using the absolute position may include mapping it to screen coordinates associated with the computer application, and using the mapped position as an interface to the computer application. The method may also include recognizing gestures associated with the object by analyzing changes in the absolute position of the object, and combining the absolute position and the gestures into the interface with the computer application.

In another aspect, a multiple camera tracking system for interfacing with an application program running on a computer is disclosed. The multiple camera tracking system includes two or more video cameras arranged to provide different viewpoints of a region of interest and operable to produce a series of video images. A processor is operable to receive the series of video images and detect objects appearing in the region of interest. The processor executes a process to generate a background data set from the video images, generate an image data set for each received video image, compare each image data set to the background data set to produce a difference map for each image data set, detect a relative position of an object of interest within each difference map, produce an absolute position of the object of interest from the relative positions, and map the absolute position to a position indicator associated with the application program.

In the implementations described above, the object may be a human hand. Further, the region of interest may be defined to be in front of a video display associated with the computer, and the processor may be operable to map the absolute position of the object to the position indicator such that the position of the position indicator on the video display is aligned with the object.

The region of interest may instead be defined at any distance in front of a video display associated with the computer, with the processor operable to map the absolute position of the object to the position indicator such that the position indicator on the video display is aligned with the location at which the object points. Alternatively, the region of interest may be defined at any distance in front of the video display, with the processor operable to map the absolute position of the object to the position indicator such that movements of the object are scaled to larger movements of the position indicator on the video display.

The processor may be configured to emulate a computer mouse function. This may include configuring the processor to emulate the control buttons of a computer mouse using gestures derived from the motion of the object. A position of the object held within a tolerance for a predetermined period may trigger a selection action within the application program. The processor may likewise be configured to emulate a computer mouse control button when the position of the object is held, within a tolerance and within the boundary of an interactive display region, for a predetermined period, triggering the selection action within the application program.

In the aspects described above, the background data set may include data points representing at least a portion of a static structure.
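The hover-to-select behaviour described above (a position held within a tolerance for a predetermined period triggers a selection) can be sketched over a stream of tracked positions. The frame-based timing and all names are illustrative:

```python
def dwell_select(positions, radius=5.0, dwell_frames=10):
    """Return the index of the frame at which the target has stayed
    within `radius` of an anchor point for `dwell_frames` consecutive
    frames, or None if no such dwell occurs."""
    anchor, count = None, 0
    for i, (x, y) in enumerate(positions):
        near = (anchor is not None and
                (x - anchor[0]) ** 2 + (y - anchor[1]) ** 2 <= radius ** 2)
        if near:
            count += 1
            if count >= dwell_frames:
                return i
        else:
            anchor, count = (x, y), 1
    return None


path = [(0, 0), (40, 2)] + [(100, 50)] * 12
print(dwell_select(path))  # -> 11
```

Any movement beyond the tolerance re-anchors the dwell timer, so a moving hand never fires a selection accidentally.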

In such an implementation, the at least partially static structure may include a patterned surface that is visible to the cameras. The static structure may be a window frame, or the static structure may include a source of light.

In another aspect, a multiple camera tracking system for interfacing with an application program running on a computer is disclosed. The system includes two or more video cameras arranged to provide the desired viewpoints and operable to produce a series of video images. A processor is operable to receive the series of video images and detect objects appearing in the region of interest. The processor executes a process to generate a background data set from the video images, generate an image data set for each received video image, compare each image data set to the background data set to produce a difference map for each image data set, detect a relative position of the object of interest within each difference map, produce an absolute position of the object from the relative positions, define sub-regions within the region of interest, identify a sub-region occupied by the object, initiate an action associated with the identified sub-region when the object occupies it, and apply that action to the interaction with the application program.

In the implementations described above, the object may be a human hand. Further, the action associated with the identified sub-region may emulate the activation of a key of a keyboard associated with the application program. In a related implementation, a position of the object held within any sub-region for a predetermined period may trigger the action.

The details of one or more implementations are set forth in the accompanying drawings and the description below. Other features and advantages will be apparent from the description, the drawings, and the claims.
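The sub-region scheme above, in which occupying a detection box fires a key-like action, reduces to a point-in-rectangle lookup. The box layout and action names below are illustrative:

```python
def region_action(pos, boxes):
    """Return the action bound to the detection box occupied by the
    target position, or None when no box is occupied. `boxes` maps an
    action name to an (x0, y0, x1, y1) rectangle."""
    x, y = pos
    for action, (x0, y0, x1, y1) in boxes.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return action
    return None


boxes = {
    "left":  (0, 0, 10, 30),
    "right": (20, 0, 30, 30),
}
print(region_action((25, 15), boxes))  # -> right
print(region_action((15, 15), boxes))  # -> None
```

Leaving a gap between boxes, as in this layout, gives the kind of dead zone that keeps a hand drifting across the boundary from chattering between two actions.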

DESCRIPTION OF DRAWINGS

FIG. 1 shows the hardware components and a typical physical layout of a typical implementation of the multiple camera system.

FIG. 2A shows the typical geometric relationship between the cameras of FIG. 1 and the various image regions.

FIG. 2B shows a typical image captured by one of the cameras of FIG. 1.

FIG. 3 shows a flow chart of the processing typically executed within a microcomputer program associated with the multiple camera system.

FIG. 4 shows a detailed flow chart of part of the processing of FIG. 3, in particular the processing that detects the object and extracts its position from the image signals captured by the cameras.

FIGS. 5A through 5D show, as gray-scale bitmap images, sample image data acquired by a camera and produced by parts of the processing of FIG. 4.

FIG. 5E shows, as a gray-scale bitmap image, sample data produced by part of the processing of FIG. 4, identifying the pixels in the sample that are considered likely to belong to the tracked object.

FIG. 6 shows a detailed flow chart of part of the processing of FIG. 4, in particular the processing of an object given a map of the pixels identified as likely to belong to the tracked object, such as the data of FIG. 5E.

FIG. 7A shows the sample data of FIG. 5E as a binary bitmap image, in which the identification processing of FIG. 6 has selected the pixels associated with the object.

FIG. 7B shows the sample data of FIG. 5E as a bar chart, in which the identification processing of FIG. 6 has selected, from the identified points, those associated with the object.

FIG. 7C shows a difference set of the sample data as a binary bitmap image, in which the identification processing of FIG. 6 has selected the pixels associated with the object and with a key part of the object.

FIG. 8 shows a detailed flow chart of part of the processing of FIG. 4, in particular the processing that generates and maintains a description of the background region within which the object is captured.

FIG. 9A shows the geometry underlying Equation 3, that is, the angle subtended by part of an object within the camera's field of view, given the position on the image plane at which the object is sensed.

FIG. 9B shows the geometry underlying Equations 4, 5, and 6, that is, the relationship between the camera positions and the tracked object.

FIG. 10 illustrates Equation 8, that is, the damping applied to the coordinates given by a change in object position, used to refine the position.

FIG. 11A shows an example application controlled by the system, in which the object controls a screen pointer in a two-dimensional space.

FIG. 11B shows the correspondence between real-world coordinates and screen coordinates used by the application of FIG. 11A.

FIGS. 12A and 12B show an example application controlled by the multiple camera control system, in which the object controls a screen pointer in a three-dimensional virtual-reality environment.

FIG. 13A shows the use of gesture detection in which the region of interest is divided into detection planes in order to recognize a gesture associated with a desired activation.

FIG. 13B shows the use of gesture detection in which the region of interest is divided into detection boxes in order to recognize gestures associated with a selected cursor direction.

FIG. 13C shows another use of gesture detection in which the region of interest is divided into detection boxes in order to recognize gestures associated with a selected cursor direction.

FIG. 13D shows in detail the adjacency relationships between the divisions of FIG. 13C.

Like reference symbols in the various drawings indicate like elements.

DETAILED DESCRIPTION

FIG. 1 shows a multiple camera motion tracking and control system 100 interfaced with an image viewing system. In this implementation, two cameras 101 and 102 scan a region of interest 103. A controlled or known background 104 surrounds the region of interest 103. When an object of interest 105 enters the region of interest 103, it is tracked by the system. The object of interest 105 may be any generic object inserted into the region of interest 103, and is typically a hand or finger of a system user. The object of interest 105 may also be a selection device such as a pointer.

The series of video images acquired from the cameras 101 and 102 is conveyed to a computing device or image processor 106. In this implementation, the computing device is a general-purpose computer that runs additional software to provide feedback to the user on a video display 107.

FIG. 2A illustrates a typical installation of the multiple camera control system 100. The two cameras 101 and 102 are positioned outside the region of interest 103. The cameras are oriented so that the overlap 204 of their fields of view (205 for camera 101, 206 for camera 102) completely encompasses the region of interest 103. The orientation is such that the cameras 101, 102 are rotated about axes that are nearly parallel. In this example, the ceiling, or a window ledge and side wall, provides a controlled background 104 with distinct edges. The corresponding view captured by camera 101 is shown in FIG. 2B.

Although the field of view captured by camera 102 is not shown, it is a mirror image of the field of view captured by camera 101. The controlled background 104 need not completely cover the entire field of view 205 of each camera. For each camera, an active image region 208 is visible that is completely contained within the controlled background 104 and that also contains the entire target region 103. The background 104 is controlled so that its characteristics can be modeled, and so that the target object 105, in whole or in part, differs from the background 104 in those characteristics. When the target object 105 appears in the target region, it occludes, in whole or in part, a portion of the controlled background 104 within the active image regions 208 of cameras 101 and 102; the captured image of the occluding object will then fail to match the modeled characteristics of the controlled background 104.

In general, once the position of the target object 105 is found within the active image regions 208 of both cameras, the object can be identified and its position computed. Using the positional data from cameras 101 and 102, together with the positions of the cameras relative to the target region 103, the position of the target object 105 within the target region 103 can be calculated.

This processing is performed by the image processor 106 (FIG. 1), and may be implemented in software or in hardware, as outlined in FIG. 3. Camera images are transmitted simultaneously from cameras 101 and 102, captured by image acquisition modules 304 and 305 (respectively), and passed into image buffers 306 and 307 (respectively) within the image processor 106. Image detection modules 308 and 309 independently detect the target object 105 in each image and determine its position relative to that camera's field of view. The relative position information 310 and 311 obtained from the two camera views is combined by a combination module 312 and, optionally refined by a position refinement module 313, yields in block 314 the overall presence and position of the target object 105 within the target region 103. If desired, specific gestures performed by the user can be detected in a gesture detection module 315.
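The frame-by-frame flow of FIG. 3 can be sketched as a minimal pipeline. All function names here are illustrative assumptions (the patent specifies modules, not code), and the per-camera detector is reduced to a trivial peak finder standing in for detection modules 308/309.

```python
# Minimal sketch of the two-camera pipeline of FIG. 3 (names are illustrative).
def detect_relative_position(image):
    """Stand-in for detection modules 308/309: return the column index of the
    largest value, expressed as a fraction of the image width."""
    peak = max(range(len(image)), key=lambda i: image[i])
    return peak / (len(image) - 1)

def combine(rel_a, rel_b):
    """Stand-in for combination module 312: merge the two per-camera relative
    positions into a single estimate (here, a simple average)."""
    return (rel_a + rel_b) / 2.0

def track_frame(frame_a, frame_b):
    # Detection runs independently on each camera's buffered frame,
    # then the two relative positions are combined.
    pos_a = detect_relative_position(frame_a)
    pos_b = detect_relative_position(frame_b)
    return combine(pos_a, pos_b)

print(track_frame([0, 9, 1, 0, 0], [0, 0, 0, 9, 1]))  # → 0.5
```

The real combination module triangulates from camera geometry rather than averaging; the averaging here only illustrates the data flow.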

The gesture detection results are then passed to another process or application 316, either running on the same image processor 106 or on a separate processing device. The gesture detection process is described in detail below.


The image detection modules 308 and 309 perform identical processing. One implementation of an image detection module 308, 309 is shown in FIG. 4. In block 402, the image processor 106 extracts the captured image data stored in image buffer 306 or 307; this image data corresponds to the active image region 208 (FIG. 2B). The image may be passed through a filtering process 403 to emphasize or extract the image properties or characteristics in which the background 104 and the target object 105 differ, and which otherwise do not vary over time in the background 104. In some implementations, the data representing the active image region may also be reduced by a scaling module 404, reducing the amount of computation required in the subsequent processing steps.

Using the resulting data, the background 104 is modeled in block 405 by one or more background-model maintenance processes, producing one or more descriptions, represented as background model data 406, of the controlled background 104. The background 104 is thus modeled in terms of the chosen image properties or characteristics. The background model 406 is converted into a set of acceptance criteria in process 407. In comparison process 408, the filtered (from process 403) and/or reduced (from module 404) image data is compared against these criteria (from process 407), and locations where the current data is inconsistent with the background model data 406, that is, where the criteria are not met, are recorded in an image called the difference map 409. In detection module 410, the difference map 409 is analyzed to determine whether any such inconsistencies are plausible evidence of a target object 105 and, where the criteria are met, to determine the object's position within the camera's field of view (205 or 206). The position of the target object 105 may optionally be refined further in block 411, which produces, for this camera, the presence and position output 310 or 311 associated with the target object 105 (as described with FIG. 3).

In block 402 of FIG. 4, the image processor 106 extracts the image data corresponding to the active image region 208 (FIG. 2B). The image data may be extracted by cropping, shearing, rotating, or otherwise transforming the captured image data. Cropping extracts only the portion of the full image that lies within the active image region 208: a boundary is defined, all pixels inside the boundary are copied unchanged into a new buffer, and pixels outside the boundary are ignored. The active image region 208 may have any shape. Shearing and rotation reorder the data into an order more convenient for further processing, such as a rectangle addressable by pixel rows and columns.

Rotation makes the displayed image content appear as if the image had been rotated. Rotation reorders the pixel positions from (x, y) to (x', y') according to the following equation:

    [x']   [cos θ   -sin θ   0] [x]
    [y'] = [sin θ    cos θ   0] [y]
    [1 ]   [  0        0     1] [1]

where θ is the image rotation angle.

If cameras 101 and 102 are mounted correctly with respect to the target region 103, the angle to be rotated will generally be small. When the rotation angle is small, shearing provides a computationally simpler approximation of rotation. Shearing distorts the image shape so that, after the transformation, columns and rows appear to have slid past one another. Shearing reorders the pixel positions according to the following equation:

    [x']   [1      sh_x   0] [x]
    [y'] = [sh_y   1      0] [y]
    [1 ]   [0      0      1] [1]
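As a concrete illustration of the pixel reindexing just described, the following sketch applies the rotation transform and its small-angle shear approximation to a single coordinate. It is an assumption-level example, not the patent's implementation; the choice sh_x = -θ, sh_y = θ is one common small-angle substitution.

```python
import math

def rotate_point(x, y, theta):
    """Apply the homogeneous rotation of the equation above to (x, y)."""
    xr = x * math.cos(theta) - y * math.sin(theta)
    yr = x * math.sin(theta) + y * math.cos(theta)
    return xr, yr

def shear_point(x, y, sh_x, sh_y):
    """Shear approximation: rows and columns slide past one another."""
    return x + sh_x * y, y + sh_y * x

# For small angles, shear with sh_x = -theta, sh_y = theta approximates rotation
# (cos(theta) ~ 1, sin(theta) ~ theta).
theta = 0.01
xr, yr = rotate_point(100.0, 50.0, theta)
xs, ys = shear_point(100.0, 50.0, -theta, theta)
print(abs(xr - xs) < 0.1 and abs(yr - ys) < 0.1)  # → True
```

In an image pipeline the same mapping is evaluated per output pixel to find its source coordinate; the single-point form above keeps the arithmetic visible.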

where sh_x is the horizontal shear amount within the image and sh_y is the vertical shear amount within the image.

One scenario in which the multiple-camera control system 100 is applied is where the target object 105, in whole or in part, has a higher or lower luminance than the controlled background 104. For example, the background 104 may be lit to create this scenario. Filtering block 403 passes the luminance information associated with the image data. A single background model 406 represents the expected luminance of the background 104. In practice, the luminance of the controlled background 104 may vary across the active image region 208, so the background model 406 may store the expected luminance value of every pixel within the active image region 208. The criteria generation process 407 accounts for signal noise (which may be characterized within the background model) and for small variations in the luminance of the controlled background 104 by adjusting each luminance value taken from the background model 406, so that a small range of luminance values will be classified as consistent with the background model. For example, if the luminance of the controlled background 104 is higher than that of the target object 105, processing block 407 lowers each pixel's luminance value by an amount larger than the expected signal noise and luminance variability.

In some implementations of the system 100, the target region 103 is narrow enough that it may be modeled as a planar region. That plane is oriented parallel to the front and back faces of the rectangular box used to represent the target region 103 in FIG. 1. In the optional scaling module 404, the active image region 208 may be reduced to a single row of pixels. This reduction is possible when two conditions are met: (1) whenever the target object 105 is detected, it occludes the background 104 in all rows of some columns of the active image region 208, and (2) a single set of values is sufficient to characterize all of the pixels of a column of the active image region 208 in the background model 406. The first condition is usually met when the active image region 208 is thinner than the target object 105. The second condition is met by the operation of blocks 403, 405, 406, and 407 described above. Applying the scaling module 404 reduces the processing complexity of the subsequent steps and reduces the storage requirements of the background model 406.

The specific implementation of the scaling module 404 depends on the characteristics of processing blocks 403, 405, 406, and 407. If the controlled background 104 is expected to be brighter than the target object 105, as above, the scaling module 404 represents each column by the maximum luminance value within that column. This has the added benefit that the bright portion of the controlled background 104 need not fill the entire controlled background 104.

Another scenario is one in which the controlled background 104 is static, that is, contains no motion, but is unconstrained in luminance. The sample source image of FIG. 5A is an example. In this scenario, the object sensed by the camera may contain, or be bounded by, luminance values that also occur within the controlled background 104. In practice, the variability of the luminance of the controlled background 104 (for example, caused by the user moving in front of the device and thereby blocking some of the ambient light) can be significant relative to the difference between the controlled background 104 and the target object 105. Therefore, a filter of a particular class may be applied in filtering process 403 to produce a result that is independent of, or varies little with, the overall luminance level, while emphasizing parts of the target object 105. A 3x3 Prewitt filter is typically used in filtering process 403. FIG. 5B shows the result of filtering process 403 applied to the image of FIG. 5A. In this implementation, two background models 406 may be maintained, representing low and high values respectively, which together represent the range of values expected for each filtered pixel. The criteria generation process 407 then lowers the low values and raises the high values by an amount larger than the expected signal noise and luminance variability. The result is a pair of criteria sets; an example for the low values is shown in FIG. 5C, and an example for the high values is shown in FIG. 5D. The filtered image is passed through comparison process 408, which classifies a pixel as inconsistent with the controlled background 104 if its value is below the low criterion (FIG. 5C) or above the high criterion (FIG. 5D). The result is a binary difference map 409; an example corresponding to FIG. 5B is shown in the accompanying figure.

The foregoing implementation allows many existing surfaces, such as walls or window frames, to serve as the controlled background 104, where those surfaces may have arbitrary luminance, texture, and edges, or even a strip of light fixed to the controlled background 104. It also allows a controlled background 104 containing, for example, a predetermined pattern, texture, or light strip to be used, where the above processing detects the absence of the pattern in the regions of the controlled background 104 occluded by the target object 105.

The difference map 409 stores all of the pixel positions found by the above methods to be inconsistent with the controlled background 104. In this implementation, the difference map 409 may be represented as any image in which each pixel may be in one of two states. The pixels that are inconsistent with the controlled background 104 are identified or marked by setting the pixel at the corresponding row or column of the difference map to one of those states.

One implementation of the detection module 410, in which the target object 105 is detected within the difference map 409, is shown in FIG. 6. An optional size-reduction module in block 603 provides an additional opportunity to reduce the data to a one-dimensional array; it may be applied in scenarios where the orientation of the target object 105 has no significant effect on the overall boundary of the target object 105 in the difference map 409. In practice, this applies to scenarios in which the number of rows is smaller than, or comparable to, the typical number of columns occupied by the target object 105. When applied, the size-reduction module of block 603 reduces the difference map 409 to a single-row map, that is, a one-dimensional array of values. In this implementation, the size-reduction module 603 may count the number of marked pixels in each column of the difference map 409. As an example, the difference map 409 of FIG. 7A may be reduced in this way and plotted as the graph 709 of FIG. 7B. This additional processing step is applied to reduce the processing requirements of, and to simplify, some of the subsequent computations.

Continuing with this implementation of the detection module 410, observe that the pixels marked in the difference map 409 (for example, FIG. 7A) that are associated with the target object 105 will generally form a cluster 701, but the cluster need not be connected. A cluster identification process 604 classifies each pixel (or each column, if the size-reduction module 603 has been applied) as belonging or not belonging to the cluster 701. A variety of sample clustering methods exist and may be applied; the following method was chosen for its simplicity. Note that when a target object 105 is present, the number of correctly marked pixels can be expected to exceed the number of false positives, so the median position can be expected to fall somewhere within the bounds of the target object 105. One implementation of the cluster identification process 604, when applied to a single-row map (for example, where the size-reduction module 603 or 404 has been applied), computes the median column 702 and marks a column as part of the cluster 701 (FIG. 7B) if it lies within a predetermined distance 703 corresponding to the maximum number of columns the object is expected to occupy. Another implementation of the cluster identification process 604, applied to multi-row maps, adds marked pixels to the cluster 701 if they meet a neighbor-distance criterion.

In this implementation, a set of criteria is received by a cluster classification process 605, which then examines the cluster 701 to verify that it is consistent with the expected target object 105. Process 605 thus decides whether the cluster 701 is to be classified as belonging to a target object 105. One implementation of the cluster classification process 605 counts the marked pixels within the cluster 701 and counts all marked pixels. Comparing the count within the cluster 701 against a threshold eliminates clusters with too few marked pixels to be consistent with a target object 105. Comparing the ratio of the count within the cluster 701 to the total count against a threshold further eliminates inconsistent clusters.
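The filter-and-compare path can be sketched in a few functions: a Prewitt-style gradient filter, and a comparison against low/high background bands expanded by a noise margin, yielding a binary difference map. The array layout, the margin value, and the function names are illustrative assumptions; the gradient magnitude here uses |Gx| + |Gy| for simplicity.

```python
def prewitt_magnitude(img):
    """3x3 Prewitt gradient magnitude (|Gx| + |Gy|), zero at the border.
    `img` is a list of rows of luminance values."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(img[y + dy][x + 1] - img[y + dy][x - 1] for dy in (-1, 0, 1))
            gy = sum(img[y + 1][x + dx] - img[y - 1][x + dx] for dx in (-1, 0, 1))
            out[y][x] = abs(gx) + abs(gy)
    return out

def difference_map(filtered, low_model, high_model, margin):
    """Mark pixels falling outside the expanded [low - margin, high + margin] band."""
    return [[1 if (v < lo - margin or v > hi + margin) else 0
             for v, lo, hi in zip(fr, lr, hr)]
            for fr, lr, hr in zip(filtered, low_model, high_model)]

# A flat background filters to zero everywhere; an object pixel raises the
# gradient responses around it, so its neighborhood is marked.
bg = [[10] * 5 for _ in range(5)]
scene = [row[:] for row in bg]
scene[2][2] = 200  # object pixel differing strongly from the background
f_bg, f_scene = prewitt_magnitude(bg), prewitt_magnitude(scene)
dmap = difference_map(f_scene, f_bg, f_bg, margin=5)
print(sum(map(sum, dmap)))  # → 8 (the ring of pixels around the object edge)
```

Note that a gradient filter responds at edges, not in the interior of a uniform object, which is why the center pixel itself is not marked in this toy example.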

If the cluster 701 passes these criteria, the description of the cluster is refined in block 606, beginning with the computation, in process 607, of the center of gravity associated with the cluster 701. Although the median position found by the size-reduction module 603 is likely to lie within the bounds of the target object 105, it need not lie at the object's center. The weighted mean 710, or center of gravity, provides a better measure of the cluster's position, and may optionally be computed as sub-process 607. The weighted mean 710 is computed by the following equation:


    x̄ = Σ_{x=0}^{C-1} ( x · C[x] )  /  Σ_{x=0}^{C-1} C[x]

where x̄ is the weighted mean, C is the number of columns, and C[x] is the number of marked pixels in column x.
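The weighted mean over the column counts C[x] reduces to a few lines; this sketch is illustrative, with a guard for the empty case added as an assumption.

```python
def weighted_mean_column(counts):
    """Centroid of a single-row difference map: counts[x] is the number of
    marked pixels in column x (C[x] in the equation above)."""
    total = sum(counts)
    if total == 0:
        return None  # no marked pixels: no cluster to locate
    return sum(x * c for x, c in enumerate(counts)) / total

print(weighted_mean_column([0, 1, 3, 1, 0]))  # → 2.0 (symmetric cluster)
```

Because every marked pixel contributes, the result moves smoothly as the object moves, which is the stability property the text relies on below.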

The cluster boundary 704 may also, if desired, be computed in block 606, shown as process 608. The cluster 701 may contain some outlying false positives; to account for this, the boundary may be drawn around a predetermined percentage of the marked pixels or, in scenarios where relatively few marked pixels are expected, around the marked pixels (or columns, where the size-reduction module 603 has been applied) that form tight sub-clusters, that is, marked pixels (or columns) whose neighbors are also marked.

In addition to the center and boundary coordinates, the orientation of the target object 105 may optionally be estimated from the cluster's moments. This orientation is produced by the cluster orientation computation, shown as sub-process 609 within process 606.

In some applications of the system 100, the target object 105 serves as a pointer. In that case, if the target region 103 contains a sufficient number of rows and the rows have not been reduced, the pointing end of the target object 105 can be expected to be present, and can be determined by a pointing-end computation sub-process within process 606.
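One common way to realize a moment-based orientation estimate is through the second central moments of the marked pixels; the patent does not spell out its formulation, so the sketch below is an assumption-level example using the principal-axis angle.

```python
import math

def cluster_orientation(points):
    """Orientation of a cluster of (x, y) marked-pixel coordinates from its
    second central moments (principal-axis angle, in radians)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    mxx = sum((x - cx) ** 2 for x, y in points) / n
    myy = sum((y - cy) ** 2 for x, y in points) / n
    mxy = sum((x - cx) * (y - cy) for x, y in points) / n
    return 0.5 * math.atan2(2 * mxy, mxx - myy)

# Marked pixels along the line y = x should yield a 45-degree orientation.
pts = [(i, i) for i in range(10)]
print(round(math.degrees(cluster_orientation(pts))))  # → 45
```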

An example is shown in FIG. 7C. In general, the target object 105 will enter the active image region 208 from a known boundary of that region, or will be constrained to do so. The pointing end 705 of the target object 105 (for example, the user's fingertip) is likely to be the part of the cluster 701 farthest from the entry region 706 where the object enters the active image region 208. The cluster 701 may contain some outlying false positives. The pointing end 705 may therefore be defined by a region 707 within the cluster 701 that surrounds multiple marked pixels near the cluster's farthest extent or, in scenarios where relatively few marked pixels are expected, that surrounds the farthest marked pixels forming a tight sub-cluster, that is, marked pixels whose neighbors are also marked. This sub-cluster is identified by the sub-cluster pointing-end process 610, and the position of the sub-cluster is found in process 611.

Continuing with this implementation, a smoothing module 612 may optionally apply a smoothing process to any or all of the positions found in process 606. Smoothing is a process that combines previously computed results so that positions move in a stable manner from frame to frame. The weighted mean coordinate 710 produced by the center-of-gravity process 607 depends on many samples and is therefore naturally stable. The boundary 704 produced by the cluster boundary process 608, and the coordinates of the pointing end 705 produced by process 611, depend on relatively small subsets of the cluster, so the state of a single pixel can have a significant effect. Since the size of the region occupied by the target object 105 is expected to remain relatively stable, smoothing may be applied to the boundary 704 in terms of its distances measured relative to the cluster's weighted mean coordinate 710. Since the orientation and shape of the target object 105 are expected to change relatively slowly compared with the object's overall position, smoothing may likewise be applied to the pointing end 705 in terms of its distance measured relative to the cluster's weighted mean coordinate 710. The smoothing employs the following Equation 1:
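Equation 1 is a first-order exponential smoother and fits in a few lines; the function name and the seeding parameter `s0` are illustrative assumptions.

```python
def smooth(raw_values, a, s0):
    """Exponential smoothing per Equation 1: s(t) = a*r(t) + (1 - a)*s(t-1).
    `a` is a scalar between zero and one; `s0` seeds s(t-1) for the first frame."""
    s, out = s0, []
    for r in raw_values:
        s = a * r + (1 - a) * s
        out.append(s)
    return out

# A step in the raw position is followed gradually, suppressing frame jitter.
print(smooth([10, 10, 10], a=0.5, s0=0))  # → [5.0, 7.5, 8.75]
```

Larger values of `a` track the raw measurement more tightly; smaller values trade responsiveness for stability, which matches the text's use of smoothing on the noisier boundary and pointing-end estimates.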

    s(t) = (a × r(t)) + ((1 − a) × s(t − 1))    (Equation 1)

In Equation 1, the smoothed value at time t, s(t), is the raw value at time t, r(t), multiplied by a scalar a between zero and one, plus the smoothed value at time t − 1, s(t − 1), multiplied by one minus that scalar.

Referring to FIG. 8, implementations of the system 100 make use of one or more of the background models 406 (FIG. 4). One implementation of a background-model process, the component 405 that produces the background model data 406, is shown in FIG. 8. This implementation of the background-model component 405 generates and dynamically updates the background model automatically, allowing unattended operation of the system.

Input data 802 is presented to this background-model component 405 by the scaling module 404. Input is available every frame and is sampled in sampling process 803. A sample may include a target object 105 occluding part of the controlled background 104. For each pixel, a range of values can represent the background 104 better than a single value. By incorporating this range into the background model, the expansion performed in process 407 can be made tighter. Contributing data from multiple frames to each sample makes this range observable; however, the more frames contribute to a sample, the larger the portion of the background 104 that a moving target object 105 will occlude during the sample. The optimal number of frames depends on the motion of the target object 105 expected in the particular application of the system. In practice, for a system searching for a hand, ten frames, spanning approximately 0.33 seconds, is sufficient to observe most of the range without the object occluding an excessive portion of the background. If a particular background model's comparison in process 408 treats the upper bound of the values as consistent with the background 104, the maximum value observed for each pixel over the multiple frames can be recorded as the sample value. If a particular background model 406's comparison in process 408 treats the lower bound of the values as consistent with the background 104, the minimum value observed for each pixel over the multiple frames can be recorded as the sample value.

In this implementation of the background-model component 405, samples from sampling process 803 are added to a buffer 804 with storage for n samples, where the oldest sample in the history is replaced. The history therefore contains n sample values for each pixel. The time span d covered by the buffer depends on the rate at which new samples are taken, and samples are added to the history r according to Equation 2.

In this implementation, a median process block 805 selects, for each pixel, the value determined to represent the controlled background 104 at the position that the pixel represents. One method, used in process block 805, of selecting a value representative of the controlled background 104 is to select the median of the n samples for each pixel. For any given pixel, several of the n sample values in the buffer 804 may represent the target object 105 rather than the background. The time span d is chosen so that the target object 105 is not expected to occlude any one pixel of the controlled background 104 for an accumulated duration of d/2 or longer within any period of length d. For any pixel, the majority of the sample values will therefore represent the background 104, and hence the median of the sample values will be a value representative of the background 104.

The background-model component 405 is adaptive: any change to the background 104, once observed for d/2 time, will be reflected in the output of the median process block 805. At startup, the system does not require that the entire controlled background 104 be visible; a target object 105 may be present at startup, but samples must be observed over the time d before output is provided. Optionally, the target object 105 may be required to be absent at system startup, in which case the first observed sample values can be copied into all n samples of the buffer 804, allowing the system to produce output sooner.

Any one pixel of the controlled background 104 will be occluded by the target object 105 for some period, and that period depends on the particular application of the system.
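The sliding-history median model can be sketched as follows. The class name and API are illustrative assumptions; the seeding of the whole history from the first sample corresponds to the optional fast-start behavior described above.

```python
from collections import deque
from statistics import median

class MedianBackgroundModel:
    """Per-pixel sliding history (buffer 804) with a median readout (block 805).
    A sketch, not the patent's implementation."""
    def __init__(self, n):
        self.histories = None
        self.n = n

    def add_sample(self, frame):
        if self.histories is None:
            # Seed the whole history with the first sample for fast startup.
            self.histories = [deque([v] * self.n, maxlen=self.n) for v in frame]
        else:
            for hist, v in zip(self.histories, frame):
                hist.append(v)  # oldest sample is dropped automatically

    def background(self):
        return [median(hist) for hist in self.histories]

model = MedianBackgroundModel(n=5)
for frame in ([10, 10], [10, 10], [10, 200], [10, 10], [10, 10]):
    model.add_sample(frame)  # pixel 1 is briefly occluded by a bright object
print(model.background())  # → [10, 10]: the transient occlusion is rejected
```

Because the object occludes each pixel for less than half of the history, the median ignores it, which is exactly the d/2 argument made in the text.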
The number of samples ri can be adjusted to the memory buffer, and can be processed in series. In previous discussions, the presentations were made by cameras 101 and 102 The image is related to or obtained from the position of the target object within its range of 105. If the target object 105 is successfully detected, and its coordinates can be detected by the detection modules 3008 and 3009 in Fig. 3 in the field of view of the two cameras If found in 205 and 206, these coordinate groups are sufficient to restore the target 105 position within the range of the target area 103. In the implementation shown in FIG. 3, the target 105 position is calculated in the combination module 312. Turning to FIGS. 9A and 9B, the illustrated system is implemented by one of the combination modules 3 and 12. For each of the cameras 101 and 102, the position P 902 of the target 105 on the camera image surface 904 is converted to an angle 905, In this description, the cold table is used, which is the measured value on the reference plane, and the normal is determined by the rotation axis of the camera 01, 102. (Actually, the axes are not exactly parallel but on a single plane, but the processing described here is within error tolerance). By approaching the cameras 101, 102 as the ideal pinhole pattern of the camera, the relative angle from the camera's amplitude to the predetermined vector 906 can be cooled. As shown in Equation 3 in Equation 3, the approach calculation is as follows: f β = tan'1 — [pj is the temperature approaching this angle, and the inverse tangent is applied to the focal length (f) divided by the projection on the image plane to the reference The amount of position P on the X boundary between the plane and the image plane. For maximum accuracy, the essential camera parameters (principal point position and image size) and the radial distortion due to the lens should be modified. 
This is achieved by converting the distorted position (from relative position data 310 and 311) into an ideal position. The ideal position is the position at which target 105 would be projected onto image plane 904 if cameras 101 and 102 had ideal pinhole camera properties, so that Equation 3 produces the appropriate angle. A set of correction equations is presented by Z. Zhang in "A Flexible New Technique for Camera Calibration", Microsoft Research, http://research.microsoft.com/~zhang, incorporated herein by reference. For many applications of this system, the approximation has sufficient accuracy without these corrections.

Continuing the description of combination module 312, as shown in FIG. 9B, a reference vector 907 is defined passing through the positions of the two cameras on the reference plane, where the reference plane is defined such that the camera rotation axes form its normal. The rotation angle 908 of each camera is measured relative to reference vector 907.

A formula relating the angles is given by Equation 4:

α = β₀ + β

The angle α is the sum of the camera rotation angle β₀ 908 and the angle β. Applying Equation 4 gives the angle 909 between target 105 and reference vector 907, denoted α here. The angles α 909 for cameras 101 and 102, together with the length of reference vector 907, are sufficient to find the position of target 105 on the reference plane by Equations 5 and 6.

Equation 5 calculates the offset y of the target:

y = w · tan α_A · tan α_B / (tan α_A + tan α_B)

That is, the offset y equals the product of the reference vector length w 907, the tangent of the angle α_A for camera A 101, and the tangent of the angle α_B for camera B 102, divided by the sum of the tangents of α_A and α_B.

Equation 6 calculates the target offset x_A:

x_A = y / tan α_A

In Equation 6, the offset x_A is obtained by dividing the offset y of Equation 5 by the tangent of the angle α_A for camera A 101.

The position of target 105 on the axis perpendicular to the reference plane is given by Equation 7, which uses the distance between target 105 and the camera, applied to the position in each image:

z = l · p′ / f

In Equation 7, the position z is calculated from the position on the image plane projected onto the image-plane vector perpendicular to that used in Equation 3 (denoted p′ here), divided by the focal length f and multiplied by the distance l between target 105 and the camera.

These relationships provide the coordinates of target 105 relative to camera A 101. Knowing the position and size of target region 103 relative to camera A 101, the coordinates can be transformed to be relative to target region 103, as in 312 of FIG. 3.

If desired, smoothing may be applied to these coordinates in smoothing module 313 of the system shown in FIG. 3. Smoothing is a process that blends in previous results so that motion is stable from frame to frame. A method of smoothing these particular coordinate values (the x, y, z produced by combination module 312) is described here. Each component of the coordinate set associated with target 105, that is, x, y and z, is smoothed independently and dynamically.
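Equations 3 through 7 can be combined into a single triangulation routine. The sketch below is illustrative only: the function and parameter names are assumptions, the image-plane positions are taken as already corrected for lens distortion, and camera A is placed at the start of the reference vector.

```python
import math

def triangulate(p_a, p_b, f, rot_a, rot_b, w):
    """Recover the target position on the reference plane from two cameras.

    p_a, p_b     : image-plane positions of the target for cameras A and B
    f            : focal length, in the same units as p_a and p_b
    rot_a, rot_b : camera rotation angles relative to the reference vector
    w            : length of the reference vector joining the two cameras
    """
    # Equation 3: angle of the target relative to each camera's axis.
    beta_a = math.atan(p_a / f)
    beta_b = math.atan(p_b / f)
    # Equation 4: angle relative to the reference vector.
    alpha_a = rot_a + beta_a
    alpha_b = rot_b + beta_b
    # Equation 5: offset perpendicular to the reference vector.
    y = (w * math.tan(alpha_a) * math.tan(alpha_b)
         / (math.tan(alpha_a) + math.tan(alpha_b)))
    # Equation 6: offset along the reference vector, measured from camera A.
    x = y / math.tan(alpha_a)
    return x, y

def height_above_plane(p_perp, f, l):
    # Equation 7: position along the axis perpendicular to the reference
    # plane, from the perpendicular image-plane component p_perp and the
    # target-to-camera distance l.
    return l * p_perp / f
```

In the symmetric case (both viewing angles 45 degrees, baseline 2), the target is recovered at (1, 1), which agrees with Equations 5 and 6.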
The damping factor S is calculated by Equation 8, where S is adjusted dynamically and independently in response to the change in position:

S = (1 − α) · S_A + α · S_B

where

α = 0, if D < D_A
α = (D − D_A) / (D_B − D_A), if D_A ≤ D ≤ D_B
α = 1, if D > D_B

and D = |x(t) − s(t−1)|.

In Equation 8, s(t) is the smoothed value at time t, x(t) is the natural (raw) value at time t, D_A and D_B are threshold values, and S_A and S_B define the degrees of damping. The two distance thresholds D_A and D_B, shown in FIG. 10, define three ranges of motion. For a position change smaller than D_A, motion is heavily damped by S_A 1001, reducing the tendency of a value to flip back and forth between two nearby values (a side effect of the discrete sampling of the images). For a position change larger than D_B, motion is lightly damped by S_B 1002, or left undamped; this reduces or eliminates the lag and blur introduced by some other smoothing procedures. Between D_A and D_B, in the region labeled 1003, the degree of damping varies, so that the transition between light and heavy damping is less noticeable.

The quantity a applied in the smoothing update is given by Equation 9:

a = e / S

In Equation 9, the quantity a is constrained to be greater than or equal to zero and less than or equal to one; the damping value S is given by Equation 8, and e is the elapsed time since the previous frame.

Once the target 105 coordinates 314 are found, they are typically passed to another process for use, such as user application program 316. They may be passed to another process executing on the same image processor 106 that performs the computations described above, or transmitted to another computing device. Methods of passing the data to application 316 include emulating traditional user input devices (such as a mouse and keyboard), giving the system control over the existing control functions of application 316. Target 105 coordinates 314 can be produced for every video image captured by the cameras, where video frames are typically captured 30 times per second or more, resulting in only slight latency between the user's action and the application's response.
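The dynamically damped smoothing of Equations 8 and 9 can be sketched as follows for a single coordinate component. The exact update rule is partly an assumption reconstructed from the surrounding description (the step fraction a, clamped to [0, 1], moves the smoothed value toward the raw value), and the parameter names are illustrative.

```python
def smooth(prev_s, x, elapsed, d_a, d_b, s_a, s_b):
    """Dynamically damped smoothing of one coordinate component.

    prev_s   : previous smoothed value s(t-1)
    x        : new raw ("natural") value x(t)
    elapsed  : time e since the previous frame
    d_a, d_b : distance thresholds separating heavy / blended / light damping
    s_a, s_b : heavy and light damping values (s_a > s_b)
    """
    d = abs(x - prev_s)
    # Equation 8: blend the damping value between S_A and S_B.
    if d < d_a:
        frac = 0.0
    elif d > d_b:
        frac = 1.0
    else:
        frac = (d - d_a) / (d_b - d_a)
    s = (1.0 - frac) * s_a + frac * s_b
    # Equation 9: step fraction limited to [0, 1].
    a = min(1.0, max(0.0, elapsed / s))
    # Assumed update: advance the smoothed value toward the raw value.
    return prev_s + a * (x - prev_s)
```

Small position changes are heavily damped, suppressing jitter between nearby values, while large changes pass through with little or no lag.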
In a typical application of the system, application 316 provides user feedback by displaying a visual representation of a pointer on video display 107. The pointer is moved so that its position and motion mimic the motion of target 105 (for example, the user's hand).

In one variation of this user-interface style, the pointer, such as a mouse cursor, is drawn in front of other graphics, and its movement is mapped onto the two-dimensional space defined by the screen surface. This control style is similar to that of a computer mouse, such as is used in the Microsoft Windows operating system. An example of application feedback imagery using this control style 1102 is shown in FIG. 11A.

Referring to FIG. 11A (and to FIG. 3), image processor 106 also includes an optional coordinate remapping process 317 (FIG. 3). Coordinate remapping process 317 remaps the presence and position coordinates 314 (associated with target 105) onto the position of a pointer 1101 (such as a cursor or mouse pointer) overlaid on image 1102, using Equation 10 for the x coordinate and the equivalent of Equation 10 for the y coordinate:

x_s = 0, if x_h < b_l
x_s = (x_h − b_l) / (b_r − b_l), if b_l ≤ x_h ≤ b_r
x_s = 1, if x_h > b_r

In Equation 10, x_h is the coordinate position 314 associated with target 105, x_s is the on-screen pointer position mapped onto the range 0 to 1, and b_l and b_r are the left and right boundaries of a sub-region within target region 103. As shown in FIG. 11B, the entire display area 1102 is represented by a sub-region 1103 contained entirely within target region 103. A position within sub-region 1103 (for example, position A 1105) is mapped linearly to a position within display 1102 (for example, 1106). A position outside sub-region 1103 but still within target region 103 (for example, position B 1107) is mapped to the nearest position on the boundary of display area 1102 (for example, 1108). This reduces the chance that the user unintentionally moves target 105 (usually the user's hand or pointing finger) out of the sub-region while trying to move pointer 1101 to a position near the display boundary.
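The clamped linear mapping of Equation 10 can be sketched as follows for one axis; the function and parameter names are illustrative.

```python
def remap(h, lo, hi):
    """Equation 10: map a hand coordinate to a normalized screen coordinate.

    Positions inside the sub-region [lo, hi] map linearly onto [0, 1];
    positions outside the sub-region clamp to the nearest screen edge.
    """
    if h < lo:
        return 0.0
    if h > hi:
        return 1.0
    return (h - lo) / (hi - lo)
```

The same function is applied to each coordinate axis, with the sub-region boundaries for that axis.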
In scenarios where target region 103 is directly in front of video display 107, sub-region 1103 can be aligned with video display 107 so that pointer 1101 appears aligned with target 105. If target region 103 is made relatively thin, for example less than 5 cm, and sub-region 1103 is defined in this way, the system approximates touch-screen interaction for the user, without being limited to the surface of video display 107 and without requiring direct contact between the surface of video display 107 and the user (for example, the display and the user may be on opposite sides of a window). It will be appreciated that system 100 can be used with display screens of various sizes, including not only computer monitors (whether CRT or LCD displays) but also rear-projection television monitors, large flat-screen LCD monitors, and front-projection display systems.

In scenarios where target region 103 is not immediately in front of video display 107, and the active image region 208 used in the orientation calculation 609 is deep enough for the orientation of the target to be found, a vector can be extended from the target position to video display 107, using the orientation angle to detect the position on the display at which the user is pointing.

Most commonly, however, the depth of active image region 208 is insufficient for the orientation to be calculated accurately in processing block 609. In scenarios where target region 103 is not immediately in front of video display 107 and the orientation cannot be calculated, Equation 10 can be applied with sub-region 1103 smaller than the display. The processor then maps the absolute position of target 105 to the position indicator so that movements of target 105 are scaled up into large movements of the position indicator on the display, making the entire display area easy for the user to reach (for example, sub-region 1103 may be limited to at most 750 mm wide, with height in proportion, a size easily covered by most users). When configured in this way, the system still gives the user the sense of pointing at the screen.

In another variation of this style of user interface, the user moves the pointer within a three-dimensional virtual environment (examples are shown in FIGS. 12A and 12B). The virtual environment can be presented using a projection transform, so that the depth of the virtual environment is implied by the image presented on video display 107. Techniques for rendering such a virtual environment include OpenGL. Equation 10 is applied to remap each of the x, y and z coordinates (for example, sub-region 1103 becomes a box).

Applications controlled by a movable on-screen pointer (for example, FIGS. 11A, 12A and 12B), whose control has been discussed above, typically present graphical representations of data or interactive components (for example, button 1109 or object representation 1202). The user is expected to position pointer 1101 over one of these objects or, where a three-dimensional virtual environment is presented, to touch or interact with an object. For a two-dimensional interface, this condition is detected by comparing the remapped pointer position 1106 with the boundary of the object's graphical representation (for example, 1110); the condition is true if the pointer position is within the object's boundary. For a three-dimensional interface, the condition is detected by comparing the boundary 1203 of the entire pointer 1101, or, if finer control is required, of part of the pointer, with the boundary 1204 of object 1202. The user may optionally receive feedback indicating that the cursor is positioned over an object. The feedback may take many forms, including an audio cue and/or a change in the graphical representation of the cursor, the object, or both. The user may then activate, copy or move the object under the cursor. The user is expected to make a gesture, according to intent, to activate, copy or move the object.

The motion of target 105 may optionally be interpreted and classified by gesture detection process 315, described above with reference to FIG. 3. Gesture detection process 315 may use data generated by any part of the system. The final coordinates 314, the image coordinates 310 and 311, or a combination of 310, 311 and 314 may be sampled over time and provided as input to gesture detection process 315. Various gestures (for example, hovering) have been detected successfully using this data as input to gesture detection process 315.

In scenarios where the application state is known and communicated to gesture detection module 315 (for example, whether pointer 1101 is over button 1109), one gesture by which the user indicates intent to activate the object under cursor 1101 (for example, screen object 1109 or 1202) is to hold the cursor hovering over the object (for example, 1109 or 1202) for longer than a predetermined duration. This gesture is detected by monitoring the application state and triggering the gesture when that state remains unchanged for the predetermined period. The application need not be created specifically for multiple camera control system 100: existing techniques allow the application state to be monitored at will (in the Windows operating system, by setting a hook with the Windows SDK function "SetWindowsHookEx"), and a mouse click to be emulated (in the Windows operating system, with the Windows SDK function "SendInput").

In some scenarios, the application state may be unavailable or may not be monitored.

In this case, example gestures by which the user indicates intent to activate the object under cursor 1101 (for example, screen object 1109 or 1202) include holding the hand steady in a hover, or poking the hand quickly forward and back.

One method of detecting a hover is to keep a history of the position of target 105, where the history contains all of the position and presence records within a predetermined duration, ending with the most recent sample. The duration represents the minimum time for which the user must hold the hand still. The minimum and maximum positions in the three dimensions (x, y, z) are found within the history. If target 105 was present within target region 103 in every sample of the history, and the distance between the minimum and maximum is within a predetermined threshold for each of the three dimensions, the hover gesture is reported. The distance thresholds represent the maximum amount that target 105 is permitted to move, plus the maximum amount of variation expected to be introduced into the hand position by the various components of the system. This gesture is typically reported where the system emulates a mouse as described above, in which case it emulates a mouse click. Gestures representing additional mouse operations, such as double-clicking and pressing, are detected in the same emulated manner.

In addition, gestures that are independent of the position of the pointer relative to an object may optionally be detected and given meaning by the application, with or without reference to the application state. Applications that use this style of interaction typically do not explicitly use or display the target position 317 or other positions. Such applications are controlled wholly or mainly by the gestures interpreted by this system. These applications also need not be created specifically for this system, because the system can emulate actions performed on traditional user input devices, such as a keyboard or joystick.

Many useful interpretations depend directly on the absolute position of target 105 within target region 103 (or on the pointer position within sub-region 1103, which may be used equivalently). One method of producing such interpretations is to define boxes, planes or other shapes. A state is triggered if the position of target 105 (for example, the position defined by block 314, or the remapped coordinates from remapping process 317) is found within a first box (or beyond the boundary defined by a first plane), having not been there in the immediately preceding observation (whether because it was in a region outside target region 103 or because it was not detected). The state is maintained until the hand position is no longer found within a second box (or beyond the boundary defined by a second plane), at which time the state is turned off. The second box must contain the entire first box and is typically somewhat larger. Using a larger second box reduces unwanted triggering or release of the state when target 105 is detected near the box boundary, where a small motion or a small amount of image signal noise would otherwise cause position 317 to drift into and out of the box.

Typically, one of three methods of interpreting this state is used, depending on the desired gesture. In the first method, the gesture directly reflects the triggered or off state: when emulating a keyboard key or a joystick fire button, the button is pressed when the state is triggered and released when the state is off. In the second method, the gesture is produced only on the transition from off to triggered: when emulating a keyboard key or joystick button, the key is clicked once. Although the duration and the off state are not reported to the application, they are still maintained so that the gesture does not repeat until after the state has been turned off, so that each occurrence of the gesture requires a clearly defined intent by the user. The third method triggers the gesture on the transition from off to triggered, and then re-triggers it periodically at a predetermined interval for as long as the state remains on. This emulates holding down a key on a keyboard, which causes the character to repeat in some applications.

In one application of the above technique, boxes or planes are defined within target region 103 as follows. By defining a first plane 1501 (FIG. 13A) and a second plane 1502, the target region is divided into a firing region 1503 and a central region 1504 (when target 105 is in the region 1505 between the planes, the gesture is reported according to the object's previous position, as described above).
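The two-box trigger/release state described above can be sketched as a small state machine in one dimension. The class name and box bounds are illustrative assumptions; in the system the boxes would be three-dimensional regions of target region 103.

```python
class HysteresisTrigger:
    """Trigger/release state with hysteresis, per the two-box technique.

    The state turns on when the position enters the inner (first) box and
    turns off only when it leaves the larger outer (second) box, so that
    noise near the inner boundary does not toggle the state.
    """

    def __init__(self, inner, outer):
        self.inner = inner   # (lo, hi) bounds of the first (trigger) box
        self.outer = outer   # (lo, hi) bounds of the larger (release) box
        self.active = False

    def update(self, pos):
        lo_i, hi_i = self.inner
        lo_o, hi_o = self.outer
        if not self.active and lo_i <= pos <= hi_i:
            self.active = True    # entered the inner box: trigger
        elif self.active and not (lo_o <= pos <= hi_o):
            self.active = False   # left the outer box: release
        return self.active
```

The returned state can then be interpreted by any of the three methods described above (held button, single click on the off-to-on transition, or periodic re-trigger).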

The technique just described detects target 105 (typically the hand) being pushed forward. The gesture may be used to emulate the fire button of a joystick, or to make an application respond in a manner commonly associated with pressing a joystick button (for example, firing a weapon in a video game).

In another application of the above technique, boxes or planes are defined within target region 103 as follows. First-tier planes 1506, 1507, 1508 and 1509 are defined to divide target region 103 into left, right, upper and lower portions, overlapping at the corners, as shown in FIG. 13B. Second-tier planes are labeled 1510, 1511, 1512 and 1513. Each pair of first and second planes is processed independently. This set of planes emulates four direction cursor keys, where a hand in a corner triggers two keys, which many applications interpret as the four secondary 45-degree (diagonal) directions. Emulating the keyboard cursor keys in this way allows a variety of existing applications to be controlled by system 100, including Microsoft PowerPoint, which responds to the emulated cursor keys (for example, the up and down arrow keys) by moving to the next or previous slide in the presentation sequence.

Another common emulated control, for applications that expect the four 45-degree (diagonal) directions to be represented precisely, is as follows. Boxes 1514, 1515, 1516 and 1517 are defined for the four primary directions (horizontal and vertical), and additional boxes 1518, 1519, 1520 and 1521 are defined for the four secondary 45-degree (diagonal) directions, as shown in FIG. 13C. For clarity, only the first-tier boxes are illustrated. A gap is left between these boxes. FIG. 13D illustrates how neighboring boxes are defined. The gap 1524 between first-tier boxes 1522 and 1523 ensures that the user deliberately moves target 105 into a box, while the overlapping second-tier boxes 1525 and 1526 fill the gap so that the system continues to report the previous gesture until the user clearly intends to move target 105 into the neighboring box or into the central neutral region. This set of buttons may be used to emulate the eight-direction joystick keys.
A broader class of gestures depends on motion instead of, or in addition to, position. One example is the gesture of waving the hand to the left. This gesture communicates to the application that it should return to a previous page or state. By emulating key and mouse events, this gesture can be used to control information presentation software (in particular, Microsoft PowerPoint) to move to the previous slide in the presentation sequence. By emulating key and mouse events, this gesture can also cause a web browser to perform the action associated with its Back button. Similarly, waving the hand to the right is a gesture communicating to the application that it should go to the next page or state; for example, the gesture causes presentation software to advance to the next slide in the presentation sequence, and causes browser software to go to the next page.

One method of detecting a wave of the hand to the left is as follows. A thin strip along the leftmost part of target region 103 is defined as the left-edge region. The position of target 105 (for example, the position defined by block 314, or the remapped coordinates from remapping process 317) is represented by one of three states:

1. The target is present and not within the left-edge region.
2. The target is present and within the left-edge region.
3. The target is not present within the detection region.

A transition from state 1 to state 2 causes gesture detection module 315 to enter a wait state, in which it starts a timer and waits for the next transition. If a transition to state 3 is observed within a predetermined time limit, the wave-to-the-left gesture is reported. This technique can be repeated for the right, upper and lower edges and, because the hand position is available in three dimensions, can also be replicated to detect the hand being pulled back.

Various gesture detection techniques have been discussed. Other gesture detection techniques exist in the research literature (for example, Hidden Markov Models) and are applicable to the various implementations of system 100 described here.
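The edge-wave detection described above can be sketched as a small state machine. The region bound, class name, and the simplification of tracking only the timer (rather than the full previous state) are illustrative assumptions.

```python
NOT_IN_EDGE, IN_EDGE, ABSENT = 1, 2, 3

class WaveLeftDetector:
    """Report a wave-to-the-left when the hand enters the left-edge strip
    and then leaves the detection region within a time limit."""

    def __init__(self, edge_width, time_limit):
        self.edge_width = edge_width  # width of the left-edge strip
        self.time_limit = time_limit  # maximum time between transitions
        self.entered_at = None        # time of the state-1 -> state-2 transition

    def _state(self, x):
        if x is None:
            return ABSENT
        return IN_EDGE if x < self.edge_width else NOT_IN_EDGE

    def update(self, x, now):
        state = self._state(x)
        if state == IN_EDGE and self.entered_at is None:
            self.entered_at = now            # start the timer
        elif state == ABSENT and self.entered_at is not None:
            ok = (now - self.entered_at) <= self.time_limit
            self.entered_at = None
            return ok                        # gesture reported on exit, if in time
        elif state == NOT_IN_EDGE:
            self.entered_at = None           # hand moved back: cancel the wait
        return False
```

Mirrored detectors with strips along the other edges (or along the depth axis) would detect waves to the right, up, down, or the hand being pulled back.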

AT _____ B7 五、發明説明(35 ) 處所述系統100之各式施行中。 回來參閱圖1及3,將另一多重攝影機控制系統1〇〇之施 行做更詳盡的描述。隨然圖1所示係雙攝影機系統,應知 可將影像處理器106配置為可接收來自多於兩台攝影機之 輸入’在特殊應用中,並可包含四或多個拍攝攝影機。在 四個攝影機之施行中,圖3部件3〇4-311經複製以支撐兩附 加攝影機。此外,組合模組3 12經配置以接收搜尋四組與 攝影機相關的目標物1 05相關呈現及位置資料(類似於資料 3 10及311)。前述之技術及方程式(尤其是方程式5及6)適用 於附加攝影機對,其中組合模組3 12之輸出係來自各攝影 機對所有位置的平均值。手勢偵測模組3 1 5之配置類似, 俾自兩大抵上類似於偵測模組3 10及3 11之附加偵測模組 (與3 0 8、3 0 9類似)接收四組與攝影機相關的呈現及位置資 料310 、 311 。 自影像處理器106之輸出目前包含經處理之物體位置座 標以及與四台攝影機有關之手勢資訊,其可為另一處理或 使用者應用軟體3 16所用。用以計算來自兩附加攝影機與 目標物105相關之座標資訊的公式及幾何(上述)已可採用。 在一使用四台攝影機的施行中,兩附加攝影機係位於受 控背景104的下方兩角落上,並轉動使得目標區1 〇3位於各 攝影機的視野205内。四台攝影機系統的優點在於在目標 物105的位置搜尋上的精確度較高。因此,應用程式可包 含在攝影顯示10 7上較高密度之較多螢幕物體,此係因搜 尋精確度增加,使得物體可更為接近,而可正確選擇目木^ -38- 本紙張尺度適用中國國家標準(CNS) A4規格(210 X 297公釐) 543323 A7 B7 五、發明説明(36 ) 物105之些微移動。此外,當部分目標物1 〇5擋住與一或更 多其它攝影機相關的視野205時,兩附加攝影機可降低在 搜尋目標物105上的誤差。 雖已描述多種施行方式,應可瞭解可做各式修改。據上 述,其它在下列申請專利範圍之範疇内的施行方式。 -39- 本紙張尺度適用中國國家標準(CNS) A4規格(210 X 297公釐)AT _____ B7 V. Invention Description (35) Various implementations of the system 100 are described. Referring back to FIGS. 1 and 3, the implementation of another multiple camera control system 100 is described in more detail. Even with the dual camera system shown in Figure 1, it should be understood that the image processor 106 may be configured to receive input from more than two cameras' in special applications and may include four or more camera cameras. In the implementation of the four cameras, the parts 30-311 of Fig. 3 are reproduced to support the two additional cameras. In addition, the combination module 3 12 is configured to receive and search four sets of camera-related targets 1 05 related presentation and location data (similar to data 3 10 and 311). The aforementioned techniques and equations (especially equations 5 and 6) are applicable to additional camera pairs, where the output of the combination module 3 12 is the average value from all positions of each camera pair. The configuration of the gesture detection module 3 1 5 is similar. 
From the two major detection modules similar to the detection modules 3 10 and 3 11 (similar to 3 0 8 and 3 0 9), it receives four sets of cameras and cameras. Relevant presentation and location information 310, 311. The output from the image processor 106 currently contains processed object position coordinates and gesture information related to the four cameras, which can be used by another process or user application software 316. The formula and geometry (above) used to calculate the coordinate information related to the target 105 from the two additional cameras are available. In an implementation using four cameras, two additional cameras are located on the two corners below the controlled background 104, and are rotated so that the target area 103 is within the field of view 205 of each camera. The advantage of the four-camera system is that the position search accuracy of the target 105 is high. Therefore, the application can include more high-density screen objects on the photographic display 107. This is because the search accuracy is increased, the objects can be closer, and the correct selection of Mumu ^ -38- This paper size is applicable China National Standard (CNS) A4 specification (210 X 297 mm) 543323 A7 B7 5. Description of the invention (36) Some slight movement of the object 105. In addition, when a part of the object 105 blocks the field of view 205 associated with one or more other cameras, two additional cameras can reduce the error in searching for the object 105. Although various implementation methods have been described, it should be understood that various modifications can be made. According to the above, other implementation methods fall within the scope of the following patent applications. -39- This paper size applies to China National Standard (CNS) A4 (210 X 297 mm)
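The combining step described above reduces to averaging the positions reported by each camera pair. As an illustration only (not part of the patent disclosure), a minimal Python sketch of that step: the triangulation itself (the patent's Equations 5 and 6) is assumed to happen elsewhere, and the function name and the None-for-occluded convention are inventions for this example.

```python
# Illustrative sketch: averaging the (x, y) position estimates reported by
# independent camera pairs, as in the four-camera arrangement described above.
# Each pair reports an estimate, or None when its view of the object is blocked.
from typing import Optional, Sequence, Tuple

Position = Tuple[float, float]

def combine_pair_estimates(estimates: Sequence[Optional[Position]]) -> Optional[Position]:
    """Average the estimates from all camera pairs that currently see the
    object; pairs reporting None (occluded view) are skipped."""
    visible = [p for p in estimates if p is not None]
    if not visible:
        return None  # object not seen by any pair
    n = len(visible)
    x = sum(p[0] for p in visible) / n
    y = sum(p[1] for p in visible) / n
    return (x, y)

# Two pairs agree closely; the combined position is their average.
print(combine_pair_estimates([(10.0, 20.0), (12.0, 22.0)]))  # (11.0, 21.0)
# One pair is blocked, so the result falls back to the remaining pair.
print(combine_pair_estimates([(10.0, 20.0), None]))          # (10.0, 20.0)
```

A pair whose view of the object is blocked simply drops out of the average, which mirrors the occlusion benefit noted above.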

Claims (1)

Patent Application No. 090124363, replacement claims (April 2003)

1. A method of tracking an object of interest, the method comprising:
acquiring a first image and a second image representing different viewpoints of the object of interest;
processing the first image into a first image data set and the second image into a second image data set;
processing the first image data set and the second image data set to generate a background data set associated with a background;
generating a first difference map by determining differences between the first image data set and the background data set, and a second difference map by determining differences between the second image data set and the background data set;
detecting a first relative position of the object of interest in the first difference map and a second relative position of the object of interest in the second difference map; and
producing an absolute position of the object of interest from the first and second relative positions of the object of interest.

2. The method of claim 1, wherein processing the first image into the first image data set and processing the second image into the second image data set comprises determining an active image region of each of the first and second images, and extracting an active image data set from the first and second images within the active image regions.

3. The method of claim 2, wherein extracting the active image data sets comprises cropping the first and second images.

4. The method of claim 2, wherein extracting the active image data sets comprises rotating the first and second images.

5. The method of claim 2, wherein extracting the active image data sets comprises shearing the first and second images.

6. The method of claim 2, wherein extracting the active image data sets comprises arranging each active image data set as an image pixel array having rows and columns.

7. The method of claim 6, wherein extracting the active image data sets further comprises identifying the largest pixel value within each column of each image pixel array, and generating a data set having a single row in which each column is represented by the largest pixel value identified for that column.

8. The method of claim 1, wherein processing the first image into the first image data set and processing the second image into the second image data set comprises filtering the first and second images.

9. The method of claim 8, wherein the filtering comprises extracting edges in the first and second images.

10. The method of claim 8, wherein the filtering further comprises processing the first image data set and the second image data set to emphasize differences between the first image data set and the background data set, and to emphasize differences between the second image data set and the background data set.

11. The method of claim 1, wherein processing the first image data set and the second image data set to generate the background data set comprises generating a first set of one or more background data sets associated with the first image data set, and generating a second set of one or more background data sets associated with the second image data set.

12. The method of claim 11, wherein generating the first set of one or more background data sets comprises generating a first background set representing the maximum data values of the background within the first image data set, and generating the second set of one or more background data sets comprises generating a second background set representing the maximum data values of the background within the second image data set.

13. The method of claim 12, further comprising increasing the values within the first and second background sets representing the maximum data values of the background by a predetermined amount.

14. The method of claim 11, wherein generating the first set of one or more background data sets comprises generating a first background set representing the minimum data values of the background within the first image data set, and generating the second set of one or more background data sets comprises generating a second background set representing the minimum data values of the background within the second image data set.

15. The method of claim 14, further comprising decreasing the values within the first and second background sets representing the minimum data values of the background by a predetermined amount.

16. The method of claim 11, wherein generating the first set of background data sets comprises sampling the first image data set, and generating the second set of background data sets comprises sampling the second image data set.

17. The method of claim 16, wherein generating the first set of one or more background data sets comprises maintaining multiple samples of the first image data set within each background data set, and generating the second set of one or more background data sets comprises maintaining multiple samples of the second image data set within each background data set.

18. The method of claim 17, wherein generating each background data set of the first set comprises selecting, from the multiple samples, a single value representing the background for each element of the first image data set, and generating each background data set of the second set comprises selecting, from the multiple samples, a single value representing the background for each element of the second image data set.

19. The method of claim 18, wherein the selecting comprises selecting the median of all sampled values in each background data set.

20. The method of claim 16, wherein the sampling occurs automatically at predetermined intervals of time, and wherein each sample may include data that does not represent the background.

21. The method of claim 1, wherein the generating comprises comparing the first image data set to a subset of the background data set, and comparing the second image data set to a subset of the background data set.

22. The method of claim 1, wherein producing the first difference map comprises representing each element of the first image data set as one of two states, and producing the second difference map comprises representing each element of the second image data set as one of two states, the two states indicating whether or not the value is compatible with the background.

23. The method of claim 1, wherein the detecting comprises identifying clusters in the first and second difference maps, each cluster being formed of elements of its associated difference map whose state indicates incompatibility with the background.

24. The method of claim 23, wherein identifying clusters comprises reducing the difference map to a single row by counting, in each column, the elements that are incompatible with the background.

25. The method of claim 24, wherein identifying clusters further comprises identifying columns that belong to a cluster and grouping neighboring columns into the cluster.

26. The method of claim 25, wherein identifying columns within a cluster comprises identifying the median column.

27. The method of claim 23, wherein identifying clusters further comprises identifying a position associated with each cluster.

28. The method of claim 27, wherein identifying the position associated with a cluster comprises calculating a weighted mean of the elements within the cluster.

29. The method of claim 23, wherein the detecting further comprises classifying a cluster as the object of interest.

30. The method of claim 29, wherein classifying a cluster comprises counting the elements within the cluster, and classifying the cluster as the object of interest only if the count exceeds a predetermined threshold.

31. The method of claim 29, wherein classifying a cluster comprises counting the elements within the cluster and counting the total number of elements classified as incompatible with the background in the difference map, and classifying the cluster as the object of interest only if the ratio of the count of elements within the cluster to the total count exceeds a predetermined threshold.

32. The method of claim 23, wherein the detecting further comprises identifying a sub-cluster within a cluster that represents the pointing end of the object of interest, and identifying the position of the sub-cluster.

33. The method of claim 1, wherein the object of interest is a user's hand.

34. The method of claim 1, further comprising using the absolute position of the object of interest to control an application.

35. The method of claim 1, further comprising:
acquiring a third image and a fourth image representing different viewpoints of the object of interest;
processing the third image into a third image data set and the fourth image into a fourth image data set;
processing the third image data set and the fourth image data set to generate the background data set associated with the background;
generating a third difference map by determining differences between the third image data set and the background data set, and a fourth difference map by determining differences between the fourth image data set and the background data set;
detecting a third relative position of the object of interest in the third difference map and a fourth relative position of the object of interest in the fourth difference map; and
producing the absolute position of the object of interest from the first, second, third and fourth relative positions of the object of interest.

36. The method of claim 35, wherein the object of interest is a user's hand.

37. The method of claim 35, further comprising using the absolute position of the object of interest to control an application.

38. A method of tracking an object of interest, using a user as an interface with a computer, the method comprising:
acquiring images from at least two viewpoints;
processing the acquired images to produce an image data set for each acquired image;
comparing each image data set to one or more background data sets to produce a difference map for each acquired image;
detecting the relative position of the object of interest within each difference map;
producing an absolute position of the object of interest from the relative positions of the object of interest; and
using the absolute position to allow the user to interact with a computer application.

39. The method of claim 38, further comprising:
mapping the absolute position of the object of interest to screen coordinates associated with the computer application; and
using the mapped position as an interface with the computer application.

40. The method of claim 38, further comprising:
recognizing gestures associated with the object of interest by analyzing changes in the absolute position of the object of interest; and
combining the absolute position and the gestures as an interface with the computer application.

41. A multiple camera tracking system for interfacing with an application program running on a computer, the tracking system comprising:
two or more video cameras arranged to provide different viewpoints of a region of interest and operable to produce a series of video images; and
a processor operable to receive the series of video images and detect objects appearing in the region of interest, the processor executing a process to:
generate a background data set from the video images;
generate an image data set for each received video image;
compare each image data set to the background data set to produce a difference map for each image data set;
detect the relative position of an object of interest within each difference map; and
produce an absolute position of the object of interest from the relative positions of the object of interest, and map the absolute position to a position indicator associated with the application program.

42. The tracking system of claim 41, wherein the object of interest is a human hand.

43. The tracking system of claim 41, wherein the region of interest is defined in front of a video display associated with the computer, and wherein the processor is further operable to map the absolute position of the object of interest to the position indicator such that the position of the position indicator on the video display is aligned with the object of interest.

44. The tracking system of claim 41, wherein the region of interest is defined at any distance in front of a video display associated with the computer, and wherein the processor is further operable to map the absolute position of the object of interest to the position indicator such that the position of the position indicator on the video display is aligned with the position pointed to by the object of interest.

45. The tracking system of claim 41, wherein the region of interest may be defined at any distance in front of a video display associated with the computer, and wherein the processor is operable to map the absolute position of the object of interest to the position indicator such that movements of the object of interest are scaled to larger movements of the position indicator on the video display.

46. The tracking system of claim 41, wherein the processor is further configurable to emulate the functions of a computer mouse.

47. The tracking system of claim 41, wherein the processor is further configurable to emulate the control buttons of a computer mouse using gestures derived from the movements of the object of interest.

48. The tracking system of claim 41, wherein a position of the object of interest that remains within a tolerance for a predetermined period of time triggers a selection action within the application program.

49. The tracking system of claim 41, wherein the processor is further configurable to emulate the control buttons of a computer mouse based on the position of the object of interest remaining within a tolerance for a predetermined period of time.

50. The tracking system of claim 41, wherein the object of interest remaining within the boundary of an interactive display region for a predetermined period of time triggers a selection action within the application program.

51. The tracking system of claim 41, wherein the processor is configurable to emulate the control buttons of a computer mouse based on the position indicator remaining within the boundary of an interactive display region for a predetermined period of time.

52. The tracking system of claim 41, wherein the background data set includes data points representing an at least partially static structure.

53. The tracking system of claim 52, wherein the at least partially static structure includes a patterned surface visible to the cameras.

54. The tracking system of claim 52, wherein the static structure is a window frame.

55. The tracking system of claim 52, wherein the static structure includes a series of lights.

56. A multiple camera tracking system for interfacing with an application program running on a computer, the tracking system comprising:
two or more video cameras arranged to provide different viewpoints of a region of interest and operable to produce a series of video images; and
a processor operable to receive the series of video images and detect objects appearing in the region of interest, the processor executing a process to:
generate a background data set from the video images;
generate an image data set for each received video image;
compare each image data set to the background data set to produce a difference map for each image data set;
detect the relative position of an object of interest within each difference map;
produce an absolute position of the object of interest from the relative positions of the object of interest;
define sub-regions within the region of interest;
identify a sub-region occupied by the object of interest;
trigger an action associated with that sub-region when the object of interest occupies the identified sub-region; and
apply the action to interacting with the application program.

57. The tracking system of claim 56, wherein the object of interest is a human hand.

58. The tracking system of claim 56, wherein the triggered action emulates an action of a computer mouse associated with the application program.

59. The tracking system of claim 56, wherein the object of interest remaining within any sub-region for a predetermined period of time triggers the action.
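Claims 22 through 28 recite a concrete detection pipeline: a binary difference map, reduction to a single row of per-column counts, grouping of neighboring active columns into clusters, and a weighted-mean cluster position. As an illustration only (not the patented implementation), a minimal Python sketch of those steps; the tolerance and threshold values are assumptions for the example.

```python
# Illustrative sketch of the difference-map and cluster steps described in
# the claims: mark each pixel as compatible or incompatible with a background
# model, count incompatible pixels per column, group neighboring active
# columns into clusters, and report each cluster's weighted-mean column.
def detect_clusters(image, background, tolerance=10, column_threshold=1):
    """Return the weighted-mean column position of each cluster of pixels
    that differ from the background model by more than the tolerance."""
    rows, cols = len(image), len(image[0])
    # Difference map: True where a pixel is incompatible with the background.
    diff = [[abs(image[r][c] - background[r][c]) > tolerance
             for c in range(cols)] for r in range(rows)]
    # Reduce the map to a single row: count incompatible pixels per column.
    counts = [sum(diff[r][c] for r in range(rows)) for c in range(cols)]
    # Group neighboring active columns into clusters.
    clusters, current = [], []
    for c, n in enumerate(counts):
        if n >= column_threshold:
            current.append(c)
        elif current:
            clusters.append(current)
            current = []
    if current:
        clusters.append(current)
    # Position of each cluster: mean of its column indices, weighted by
    # the per-column counts.
    positions = []
    for cluster in clusters:
        total = sum(counts[c] for c in cluster)
        positions.append(sum(c * counts[c] for c in cluster) / total)
    return positions

# A 3x6 frame with an object occupying columns 1-2 over a zero background:
frame = [[0, 50, 50, 0, 0, 0],
         [0, 50, 50, 0, 0, 0],
         [0,  0, 50, 0, 0, 0]]
empty = [[0] * 6 for _ in range(3)]
print(detect_clusters(frame, empty))  # [1.6]
```

Running this on one camera's frame yields that camera's relative cluster positions; combining the positions from two or more viewpoints into an absolute position is the separate triangulation step recited in claim 1.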
TW90124363A 2000-10-03 2001-10-02 Multiple camera control system TW543323B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US23718700P 2000-10-03 2000-10-03

Publications (1)

Publication Number Publication Date
TW543323B true TW543323B (en) 2003-07-21

Family

ID=29735553

Family Applications (1)

Application Number Title Priority Date Filing Date
TW90124363A TW543323B (en) 2000-10-03 2001-10-02 Multiple camera control system

Country Status (1)

Country Link
TW (1) TW543323B (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI383680B (en) * 2008-04-10 2013-01-21 Univ Nat Chiao Tung Integrated image surveillance system and manufacturing method thereof
US9262015B2 (en) 2010-06-28 2016-02-16 Intel Corporation System for portable tangible interaction
TWI550437B (en) * 2010-06-28 2016-09-21 英特爾公司 Apparatus capable of tangible interaction, article of manufacture, and method for tangible interaction
TWI420906B (en) * 2010-10-13 2013-12-21 Ind Tech Res Inst Tracking system and method for regions of interest and computer program product thereof
US8994718B2 (en) 2010-12-21 2015-03-31 Microsoft Technology Licensing, Llc Skeletal control of three-dimensional virtual world
CN102542160B (en) * 2010-12-21 2015-10-28 微软技术许可有限责任公司 The skeleton of three-dimensional virtual world controls
CN102542160A (en) * 2010-12-21 2012-07-04 微软公司 Skeletal control of three-dimensional virtual world
US9489053B2 (en) 2010-12-21 2016-11-08 Microsoft Technology Licensing, Llc Skeletal control of three-dimensional virtual world
CN103198290A (en) * 2012-01-10 2013-07-10 冯振 Method for detecting number, positions and moving of human bodies through video
US9971491B2 (en) 2014-01-09 2018-05-15 Microsoft Technology Licensing, Llc Gesture library for natural user input
US10033933B2 (en) 2015-04-07 2018-07-24 Synology Incorporated Method for controlling surveillance system with aid of automatically generated patrol routes, and associated apparatus
TWI716708B (en) * 2017-07-18 2021-01-21 大陸商杭州他若信息科技有限公司 Intelligent object tracking
TWI775135B (en) * 2019-08-28 2022-08-21 大陸商北京市商湯科技開發有限公司 Interaction method, apparatus, device and storage medium

Similar Documents

Publication Publication Date Title
CA2424673C (en) Multiple camera control system
US11238644B2 (en) Image processing method and apparatus, storage medium, and computer device
AU2001294970B2 (en) Object tracking system using multiple cameras
WO2022022036A1 (en) Display method, apparatus and device, storage medium, and computer program
Kollorz et al. Gesture recognition with a time-of-flight camera
US7274803B1 (en) Method and system for detecting conscious hand movement patterns and computer-generated visual feedback for facilitating human-computer interaction
US7944454B2 (en) System and method for user monitoring interface of 3-D video streams from multiple cameras
JP4965653B2 (en) Virtual controller for visual display
US8854433B1 (en) Method and system enabling natural user interface gestures with an electronic system
US6775014B2 (en) System and method for determining the location of a target in a room or small area
JP5453246B2 (en) Camera-based user input for compact devices
US20140248950A1 (en) System and method of interaction for mobile devices
AU2001294970A1 (en) Object tracking system using multiple cameras
WO2022022029A1 (en) Virtual display method, apparatus and device, and computer readable storage medium
TW543323B (en) Multiple camera control system
JP7293362B2 (en) Imaging method, device, electronic equipment and storage medium
WO2023030176A1 (en) Video processing method and apparatus, computer-readable storage medium, and computer device
Molyneaux et al. Cooperative augmentation of mobile smart objects with projected displays
TW561423B (en) Video-based image control system
Malerczyk Dynamic Gestural Interaction with Immersive Environments
Rhee et al. Combining pointing gestures with video avatars for remote collaboration
JP2004227073A (en) Information selection method and information selection program

Legal Events

Date Code Title Description
GD4A Issue of patent certificate for granted invention patent
MK4A Expiration of patent term of an invention patent