TW201108005A - Video search method and apparatus using motion vectors - Google Patents
- Publication number
- TW201108005A (TW99113963A)
- Authority
- TW
- Taiwan
- Prior art keywords
- video
- search
- motion vector
- map
- file
- Prior art date
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
201108005
P52980146TW 33280twf.doc/I
VI. Description of the Invention:
[Technical Field of the Invention]
The present invention relates to a video search method and apparatus, and in particular to a search method and apparatus that perform video search using video content as the search input condition.
[Prior Art]
Search technology on the Internet today is almost entirely text-based. Search engines such as Google, Yahoo, and the domestic 無名小站 all rely mainly on text search. Every search engine tries to get past the limits of text, for example by letting a typed query return content in Simplified Chinese or even in other languages, but searches remain bound to text keywords. For example, when searching for related multimedia data such as audio or video files, there may be too little text content to search on, or different countries may use different translated titles, so a keyword alone cannot find the correct data or more of the related data.
Google launched a service for finding photos with photos in April 2009, the first service in the world to use photo content to find data with related content. For example, referring to FIG. 1A, when the keyword 110 "apple" is entered in the input box 120, data related to "apple" appears, including both images of the apple shape and the "iPhone" handset, a product of the trademark "Apple®". Further selection at this point can exclude much of the unwanted data. In FIG. 1B, after the user selects an image of the apple shape, the search engine displays further images of this kind of fruit (apple). In FIG. 1C, after the user selects an image of the "iPhone" handset of the trademark "Apple®", other images related to this product are displayed, locating the photos the user wants more precisely. This technique, however, uses image content to search for images and is limited to related photo files; it offers no way to search multimedia files.
To break through this limitation, the MPEG-7 standard defined by the Moving Picture Experts Group (MPEG) proposes a standard for providing supplementary information about content, particularly multimedia digital content. Under MPEG-7, a corresponding Multimedia Content Description can be provided for multimedia, independently of the other MPEG standards, and this digital content description can even be attached to an analog movie file. As shown in FIG. 2, each item of audio-visual content ("AV Content" in the figure) can be given a corresponding Content Description, which mainly sets out the feature values of that audio-visual content. The file is arranged, for example, as illustrated:
AV+Descript+AV+Descript+AV+Descript+...
Here "AV" stands for audio-visual content and "Descript" for the corresponding content description.
This architecture is overly complex, however: all multimedia files must be rearranged, so it does not suit existing files and architectures. Moreover, although related multimedia files can be found from the feature values through a keyword-like search, it still cannot escape the language barriers that text search creates.
In addition, as the combination of the Internet and TV becomes more widespread, video search on a television inevitably runs into the problem of keyword entry. A viewer typically holds only a remote control, whose size and functions cannot replace a keyboard as a text input device. How to control video search with a remote control on a network TV is therefore a problem for future applications of this kind.
[Summary of the Invention]
In one exemplary embodiment, a video search method is provided. The method includes parsing the bit streams of an input query video file and of a plurality of video files to be searched, and extracting the corresponding motion vectors. A plurality of corresponding motion vector maps are built along the time axis from the motion vectors. A degree of correlation is obtained from the motion vector map of the query video file and the motion vector maps of the video files, and the video search result is obtained from that degree of correlation.
In one exemplary embodiment, a video search apparatus is provided, including a stream parser, a 3D motion vector map generator, and a 3D motion vector map comparator. The stream parser parses the bit stream of a video file and extracts the motion vector (MV) data in the bit stream at different ratios. The 3D motion vector map generator builds a 3D motion vector map (3D-MV map) carrying time-axis data from the motion vectors. The 3D motion vector map comparator performs a computation on the 3D motion vector maps and finds the degree of correlation of the video files from the result, from which the video search result is obtained.
In one exemplary embodiment, a video playback apparatus is provided, having a host and a controller. The host has a video search apparatus that includes a stream parser, a 3D motion vector map generator, and a 3D motion vector map comparator. The controller has a function control device, with which the user selects a segment of the video data played by the host as the query video file for the video search apparatus.
To make the above features and advantages of the invention easier to understand, embodiments are described in detail below with reference to the accompanying drawings.
[Detailed Description]
The exemplary embodiments propose a new video search technique that goes beyond current text-based search technology and uses video content as the search condition, so that video is found with video.
In one of the exemplary embodiments, a segment of a video file is selected as the query message. The selection may be made by the user choosing a video segment of any length, or by a user interface automatically picking a video segment of a fixed or specific period; either approach can be used in this example.
In one embodiment, the user selection described above can be built into a remote control that controls video playback, for example for a television or a DVD player, or into a user interface on a touch display or screen. Any arrangement that lets the user conveniently and simply mark a period of video as the query message falls within the application of the invention.
For the video file selected as the query condition, the name, the video format, and even the picture size and quality may differ between copies, but for the same film the plot is the same, so the motion vector (MV) distribution is the same or similar, as shown in FIG. 7A or 7B and described later. Building a search index for the selected video file is therefore enough to find films with the same or a similar plot. For example, the query video file and all other video files to be searched can be converted to the same format. Because this embodiment adds the time-domain feature, the corresponding video segments in the same time-axis interval can be located in all the video files to be searched and then converted to the same format.
In one embodiment, the other video files may reside on a host in a local area network, in the host database of a search engine, in a cloud database, and so on. The format conversion may be carried out on a personal host, a system server on the local area network, a search engine host, a cloud computing system, and the like.
In this exemplary embodiment, the purpose of converting to the same format is to obtain the motion vectors (MVs) of the query video file and of all other video files to be searched. That is, the motion vectors are extracted from the frames of all compressed video files, and the search index is built from them. In one embodiment, a stream parser can parse the data bit streams of all compressed video files and extract their motion vectors. For motion vectors at different resolutions, this embodiment proposes a statistical method: taking the group of pictures (GOP) as the basic unit, compute, over the macroblocks (MBs) of all its frames, the share of each block size to which the motion vectors belong, and use, for example, a threshold to decide which motion vectors to use for building the search index.
Because the same film has the same plot whatever its name, format, picture size, or quality, its MV distribution is the same or similar, as shown in FIG. 7A or 7B; in other words, the frames change little along the time axis. The MV values of the corresponding frames can therefore optionally be sampled at a fixed ratio (for example 1:2, 1:4, or 1:N, where N is an integer). The resulting 3-D motion vector map only needs to reach a certain level of accuracy.
In one of the exemplary embodiments, the search is performed on the 3-D motion vector maps. The MV values of all macroblocks (MBs) in the 3-D motion vector map of the selected video file (the query target) are differenced against the MV values of the corresponding macroblocks in the 3-D motion vector maps of all the video files to be searched. The distribution of the differences is then compared to obtain the degree of correlation, which serves as the basis for the displayed results.
In step 310, the video-to-video search operation starts. In step 320, a segment of a video file is selected as the query message. The selection may be made by the user picking a period of video while it plays in video player software, by using a remote control or other means to mark a segment of a film being played as the query message, or through a user interface that automatically picks a video segment of a fixed or specific period after the user presses a function button. Any other way of selecting a period of a video file is likewise within the scope of this example.
After the query video file is selected, as in step 330, the names, video formats, and even picture sizes and quality may differ. The query video file and all the video files to be searched can therefore optionally be converted to the same format; this step is unnecessary when the video files already share a format. Because this embodiment adds the time-domain feature, the corresponding video segments in the same time-axis interval can be located in all the video files to be searched and then converted to the same format. The video files to be searched may reside on a host in a local area network, in the host database of a search engine, in a cloud database, and so on. The format conversion may be carried out on a personal host, a system server on the local area network, a search engine host, or a cloud computing system.
In step 340, the bit streams of all video files, which are usually already compressed data, are parsed. The MV values of the corresponding frames can optionally be taken at a fixed ratio, so that the sampling rate along the time axis can be adjusted flexibly. The purpose of converting all video files (including the query video file) to the same format is to obtain the motion vectors of the frames in every video file. That is, the motion vectors are extracted from all compressed video files, and the search index is built from them.
Motion vectors at different resolutions can be changed in this embodiment by up-sampling or down-sampling. A video file generally consists of many frames arranged consecutively in time order. Each frame is encoded as many macroblocks (MBs), each of, for example, 16x16 pixels. A macroblock may carry one motion vector or as many as 16 (one MB can be further divided into sixteen 4x4 sub-blocks), so films in different formats may have anywhere from 1 to 16 MV values in a single MB, and the later MV difference computation would have no one-to-one correspondence. To unify the resolution, the number of motion vectors per macroblock must be made consistent. In one embodiment, if n motion vectors are to be reduced to one, an averaging method can be used: the mean of the n motion vector values is computed.
Conversely, if a single motion vector is to be expanded to n motion vectors, it can be turned into n motion vectors of the same value. Whether the motion vectors of a macroblock are expanded from one to n or reduced from n to one can be decided statistically. MPEG video coding formats usually define a group of pictures (GOP). For example, when MPEG-4 processes continuous moving pictures, a GOP is defined to achieve better compression and to allow random access within the image data; in MPEG-4 a GOP includes, for example, nine pictures (one I picture, two forward-predicted P pictures, and six bidirectionally predicted B pictures). To find out which number of motion vectors is suitable, in one example the GOP is taken as the basic unit, the share of each block size to which the motion vectors of its macroblocks belong is computed, and, for example, a threshold decides which number of motion vectors is used for building the search index.
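The two adjustments described above for unifying the number of motion vectors per macroblock (averaging n vectors into one, or copying one vector n times) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `MV` tuple type and the function names `merge_to_one`, `expand_to_n`, and `normalize` are assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical representation: one motion vector as a (dx, dy) displacement.
MV = Tuple[float, float]


def merge_to_one(mvs: List[MV]) -> MV:
    """Reduce the n motion vectors of one macroblock to their mean."""
    n = len(mvs)
    if n == 0:
        raise ValueError("a macroblock needs at least one motion vector")
    return (sum(dx for dx, _ in mvs) / n, sum(dy for _, dy in mvs) / n)


def expand_to_n(mv: MV, n: int) -> List[MV]:
    """Expand a single motion vector to n motion vectors of the same value."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [mv] * n


def normalize(mvs: List[MV], target: int) -> List[MV]:
    """Bring one macroblock to `target` motion vectors (n -> 1 or 1 -> n)."""
    if len(mvs) == target:
        return list(mvs)
    if target == 1:
        return [merge_to_one(mvs)]
    if len(mvs) == 1:
        return expand_to_n(mvs[0], target)
    raise ValueError("only n -> 1 and 1 -> n conversions are sketched here")


merged = merge_to_one([(2.0, 4.0), (4.0, 0.0)])
copies = expand_to_n((1.0, -1.0), 4)
```

Conversions between two counts that are both greater than one are left out on purpose, since the text only describes the n-to-one and one-to-n cases.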
Next, referring to step 350, a 3D motion vector map (3D-MV map) is built from the motion vectors (MVs). A computation on the maps yields correlation values, the related video files are found by ranking on those values, and the results are displayed by degree of relevance.
Copies of the query video file may differ in name, video format, and even picture size and quality, but for the same film the plot is the same, so the MV distribution is the same or similar, as shown in FIG. 7A or 7B. Building a search index for the selected video file is therefore enough to find films with the same or a similar plot. In one of the exemplary embodiments, the search index is built by storing the extracted motion vectors in a computation matrix, which gives a 2D motion vector map (2D-MV map). Over the time interval of the selected video file, for example 30 seconds or one minute, the successively displayed frames produce different 2D motion vector maps; taking the time axis into account yields what is called a 3-D motion vector map (3D-MV map). A time interval contains a great many frames, for example 30 frames per second and up to 1800 frames per minute, so using the motion vectors of every frame would need a large amount of computation and could delay processing.
Because the same film has the same plot across different video files, the MV distribution is the same or similar, as shown in FIG. 7A or 7B; that is, the picture changes little along the time axis. The MV values of the corresponding frames can therefore optionally be taken at a fixed ratio (for example 1:2, 1:4, or 1:N, where N is an integer). The resulting 3-D motion vector map only needs to reach a certain level of accuracy.
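The construction described above, one 2D-MV map per frame stacked along the time axis and sampled at a ratio of 1:N, can be sketched as below. The nested-list layout and the names `to_2d_map` and `build_3d_map` are assumptions for illustration only; the patent does not specify a data structure.

```python
from typing import List, Tuple

MV = Tuple[float, float]   # hypothetical (dx, dy) motion vector
Map2D = List[List[MV]]     # rows x columns of macroblocks for one frame
Map3D = List[Map2D]        # sampled frames along the time axis


def to_2d_map(mvs: List[MV], cols: int) -> Map2D:
    """Arrange one frame's per-macroblock MVs (row-major order) into a grid."""
    if cols < 1 or len(mvs) % cols != 0:
        raise ValueError("MV count must be a multiple of a positive cols")
    return [mvs[i:i + cols] for i in range(0, len(mvs), cols)]


def build_3d_map(frames: List[Map2D], n: int = 1) -> Map3D:
    """Keep every n-th frame's 2D map (sampling ratio 1:n) along the time axis."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return frames[::n]


# One minute at 30 frames per second gives 1800 frames, as in the text.
one_frame = to_2d_map([(0.0, 0.0)] * 6, cols=3)   # 2 rows x 3 columns
clip = [one_frame] * (30 * 60)
sampled = build_3d_map(clip, n=4)                  # 1:4 sampling
```

With 1:4 sampling the one-minute clip shrinks from 1800 to 450 frame maps, which is the kind of saving in computation the paragraph above is after.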
处而,’在本發明實施範例其中之一,根據所述3七 向1地圖進行躺,而找出相關的視訊職。並且根 據相關的程度而顯示搜尋的成果。在—實施例中,可以將 所選擇的視訊4#案(也就是檢索標的)的3_D移動向量地圖 中所有微方塊(Micro Block,MB)的MV值,與所有欲進行 檢索的視浦案的3_D移動向量地圖中所對應的微方塊 (Micro Block,MB)的MV值進行差值運算,而後依照運算In the meantime, one of the embodiments of the present invention lays down according to the 3-7 map, and finds the relevant video job. And the results of the search are displayed according to the degree of relevance. In an embodiment, the MV values of all the micro blocks (MBs) in the 3D moving vector map of the selected video 4# case (that is, the search target) can be compared with all the visual cases of the image to be retrieved. The MV value of the corresponding micro-block (Micro Block, MB) in the 3_D mobile vector map is subjected to difference calculation, and then according to the operation
差值分佈情況進行比對,並根據比對的結果得到相關聯 (forrelation)的程度,作為顯示結果的依據 。例如,在一實 施例中,可以根據兩個進行比較的視訊檔案,其第N個畫 框(Frame)的移動向量值,以及另一個的視訊檔案第N個晝 框,移動向量值進行例如均方根(R〇〇t_Mean Square,RMS) =算或是差值絕對值的計算,而取得差值(Distance),根據 這二差值的为佈作為相關聯(C〇rreiati〇n)程度的結果,並顯 示得到的結果。 12 201108005The difference distribution is compared and the degree of correlation is obtained according to the result of the comparison as a basis for displaying the results. For example, in one embodiment, according to two video files for comparison, the motion vector value of the Nth frame, and the Nth frame of the other video file, the motion vector value is performed, for example, R〇〇t_Mean Square (RMS) = calculation or calculation of the absolute value of the difference, and obtain the difference (Distance), according to the difference of the two differences as the degree of association (C〇rreiati〇n) The result is displayed and the results obtained are displayed. 12 201108005
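The comparison described above can be sketched with an RMS distance between two 3D-MV maps of matching shape. This is a hedged illustration: treating a smaller RMS distance as a higher degree of correlation, and ranking candidates by it, is an assumption of this sketch, as are the names `rms_distance` and `rank`.

```python
import math
from typing import Dict, List, Tuple

MV = Tuple[float, float]        # hypothetical (dx, dy) motion vector
Map3D = List[List[List[MV]]]    # frames x rows x columns of macroblocks


def rms_distance(a: Map3D, b: Map3D) -> float:
    """RMS of the MV differences over corresponding macroblocks and frames."""
    total, count = 0.0, 0
    for frame_a, frame_b in zip(a, b):
        for row_a, row_b in zip(frame_a, frame_b):
            for (ax, ay), (bx, by) in zip(row_a, row_b):
                total += (ax - bx) ** 2 + (ay - by) ** 2
                count += 1
    if count == 0:
        raise ValueError("maps share no macroblocks to compare")
    return math.sqrt(total / count)


def rank(query: Map3D, candidates: Dict[str, Map3D]) -> List[Tuple[str, float]]:
    """Order candidates from smallest to largest distance to the query."""
    scored = [(name, rms_distance(query, m)) for name, m in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1])


query = [[[(1.0, 0.0), (0.0, 1.0)]]]      # 1 frame, 1 row, 2 macroblocks
shifted = [[[(4.0, 4.0), (0.0, 1.0)]]]
ranking = rank(query, {"shifted": shifted, "same": query})
```

The absolute-difference variant mentioned in the text would replace the squared terms with `abs(ax - bx) + abs(ay - by)` and drop the square root.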
In another embodiment of the proposed video search method, refer to the flowchart of FIG. 4, which explains in detail how to build a technique that uses video content as the search condition so that video is found with video.
First, in step 410, the video-to-video search operation starts. In step 420, a segment of a video file is selected as the query message. The selection may be made by the user picking a video segment through video player software, or through a user interface that automatically picks a video segment of a fixed or specific period when the user operates it; any way of selecting a period of a video file is within the scope of this example.
After the query video file is selected, as in step 430, the names, video formats, and even picture sizes and quality may differ. The query video file and all the video files to be searched can therefore optionally be converted to the same format; this step is unnecessary when the video files already share a format.
Because this embodiment adds the time-domain feature, the corresponding video segments in the same time-axis interval can be located in all the video files to be searched and then converted to the same format. The common format may be predetermined, for example a single format used throughout this method or a format chosen by the system operator; the main design consideration is to optimize search efficiency and results.
P52980146TW 33280twf.doc/I 在此3把例+所有欲進行檢 在資料庫432中,例如可以b 检案可以疋存 的主機資料庫、或是雲;^;路== 擎的主機或是雲端運算系統進行皆可。 搜+引 在步驟440中,對所有或部分視 過壓縮的資料,其位元击、、、木通吊為已經 並且可以堂挥14从串* )進行剖析(ParSin§), 了、擇性的以-定比例(例如1:2、1.4或是⑼,豆 二义晝二T )取得對應晝框—的移躺量_ =其目的疋為了箱難時_之取樣率。本實施例將 案,括作為檢索條件的視訊檔案)轉換為相同 格式的用忍’在於所有視訊槽案的移動向量。也就是說, ,所有壓縮的視訊職取出其移動向量,據以建立檢索索 51 ° 7 而對於不同檢索條件下的解析度,也就是不同解析度 的移動向*,在本實施例中,可以利用上調取樣 (Up-Samplmg)或是下調取樣(D__Sampling)的方式進行 改,。例如,一般視訊檔案是由很多連續的晝框所組成, 而每個晝框(Frame)是由很多個微區塊_)編碼而成,而每 個微區塊MB為例如是16x16為單位,而對於每個微區塊 MB具有的移動向量,有可能一個,也可能具有16個,對 於不同的格式有不同的移動向量數量。而若是為了統一解 析度,則必須將每個微區塊MB所具有的移動向量數量調 整成一致,在一實施例中,若是n個移動向量調整成一個, 201108005P52980146TW 33280twf.doc/I Here are 3 examples + all that are to be checked in the database 432, for example, the host database that can be saved by the b check, or the cloud; ^; road == the host or the cloud of the engine The computing system can do it all. Search + in step 440, for all or part of the compressed data, the bit hit, ,, Mutong hanging as already and can be parsed from the string *) (ParSin§), and selectivity The ratio of the grading of the corresponding frame (for example, 1:2, 1.4 or (9), Bean II) is the sampling rate of the container. In this embodiment, the video file as a search condition is converted into the same format, and the mobile vector of all the video slots is used. That is to say, all compressed video jobs take out their motion vectors, based on which the retrieval line 51 ° 7 is established and the resolution for different retrieval conditions, that is, the movement direction of different resolutions, in this embodiment, Change it by up-sampling (Up-Samplmg) or down-sampling (D__Sampling). For example, a general video file is composed of a plurality of consecutive frames, and each frame is encoded by a plurality of micro-blocks, and each micro-block MB is, for example, 16x16. For each micro-block MB, there is a possibility that there may be one or six motion vectors, and there are different numbers of motion vectors for different formats. 
However, in order to unify the degree of resolution, the number of motion vectors that each micro-block MB has must be adjusted to be consistent. In one embodiment, if n mobile vectors are adjusted to one, 201108005
里的值做一平均數的 另外, 若是要將僅有1個Ω . 向量的作法, 向量,例如:The value in the value is an average. In addition, if there is only one Ω. Vector, the vector, for example:
The number of motion vectors to use may be chosen statistically, for example per group of pictures (GOP). When continuous moving images are processed under the MPEG-4 protocol, a GOP is defined to achieve better compression and to allow random access within the image data. Under the MPEG-4 protocol, for example, a GOP includes nine pictures (one I picture, two forward-predicted P pictures and six bidirectionally predicted B pictures). To decide which number of motion vectors is most suitable, in one example the GOP is taken as the basic unit, and the ratio occupied by each block size to which the macroblock motion vectors belong is compared with, for example, a threshold to determine which number of motion vectors is used to build the search index.

For example, the statistics within a GOP may give the following block sizes (BS) and ratios:

BS=16x16: 50%
BS=16x8: 15%
BS=8x8: 25%
BS=8x4: 3%
BS=4x4: 7%

In this example the threshold is set to at least 50%. The 16x16 block size accounts for 50% and therefore meets the condition, so the block size BS=16x16 is selected, and the motion vectors of this bitstream at that block size determine the resolution of the motion vector map (MV map) generated next. That is, however many MVs the original MB contains, they are all adjusted to one. If no block size reaches the threshold, the motion vectors of a block size of a certain fixed size may be used instead to determine the MV map resolution, and the adjustment is then carried out accordingly; for example, if the MB contains 9 MV values, the MVs of the search condition are likewise adjusted to 9, and the corresponding block sizes must also match.

Referring to step 450, a 3D motion vector map (3D-MV map) with time-axis data is constructed from the motion vectors (MV). Comparing the maps yields a correlation value; ranking by correlation value identifies the related video files, and the results are displayed in order of relevance.
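The threshold-based choice of block size described above can be sketched with the ratios from the example. The function name `choose_block_size` is hypothetical, and the 8x8 fallback is an assumption made only for illustration, since the text merely says that a block size of a certain fixed size is used when no ratio reaches the threshold.

```python
from typing import Dict, Tuple

BlockSize = Tuple[int, int]  # (width, height) in pixels


def choose_block_size(
    ratios: Dict[BlockSize, float],
    threshold: float = 0.5,
    fallback: BlockSize = (8, 8),  # assumed fallback; not specified in the text
) -> BlockSize:
    """Pick the block size whose share of the GOP reaches the threshold."""
    if not ratios:
        raise ValueError("no block-size statistics given")
    best = max(ratios, key=lambda size: ratios[size])
    if ratios[best] >= threshold:
        return best
    return fallback


# Ratios from the example GOP in the text.
gop_ratios = {
    (16, 16): 0.50,
    (16, 8): 0.15,
    (8, 8): 0.25,
    (8, 4): 0.03,
    (4, 4): 0.07,
}
print(choose_block_size(gop_ratios))  # (16, 16)
```

With a 50% threshold at most one block size can strictly dominate, so checking only the largest ratio is sufficient.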
After a video file has been selected as the search condition, note that the same film may exist under different names or in different formats, and even with different picture sizes and quality. Its plot is nevertheless the same, so the distribution of its motion vectors (MV) will be the same or similar, as shown in FIG. 7A and FIG. 7B. Building a search index for the selected video file is therefore enough to find films with the same or a similar plot. In one of the embodiments, the search index is built by storing the obtained motion vectors (MV) in a matrix, which yields a 2-D motion vector map (2D-MV map) whose entries are values such as (2,3), (2,4), (4,3), (6,4) and (5,2).

Depending on the time interval of the selected video file, for example one second or one minute, the consecutively displayed frames produce different 2-D motion vector maps. Taking the time axis into account then yields the data of a 3-D motion vector map (3D-MV map). A time interval contains many pictures, however: for example 30 pictures in one second and 1800 pictures in one minute. Filling in the motion vectors of every one of them requires a large amount of computation and may delay processing.

Because different video files of the same film share the same plot, the pictures change in a similar way along the time axis. The motion vector values of the corresponding frames can therefore optionally be taken at a fixed ratio (for example 1:2, 1:4 or 1:N, where N is an integer), and the resulting 3-D motion vector map only needs to reach a certain level of accuracy.

In one of the embodiments, in step 460, the 3-D motion vector maps are compared to find the related video files, and the search results are displayed according to the degree of relevance. In one embodiment, the MV values of all macroblocks in the 3-D motion vector map of the selected video file (the search target) are subtracted from the MV values of the corresponding macroblocks in the 3-D motion vector map of each video file being searched. The distribution of the differences is compared, and the comparison gives the degree of correlation on which the displayed results are based. For example, for two video files being compared, the motion vector difference between the Nth frame and the (N-1)th frame of one file and the motion vector difference between the Nth frame and the (N-1)th frame of the other file are each computed, for example as a root mean square (RMS) or as an absolute difference, to obtain a distance. The distribution of these distances gives the degree of correlation, and the result is displayed.

The distance D may, for example, be computed with the root mean square:

$$D = \sqrt{\left[\mathrm{frame}(n)_{MV} - \mathrm{frame}(n-1)_{MV}\right]^{2}}$$

or as an absolute difference:

$$D = \left|\mathrm{frame}(n)_{MV} - \mathrm{frame}(n-1)_{MV}\right|$$

The distribution of the computed distances over the different motion vectors gives the degree of correlation with the search target. Sorting by it produces the result: the most relevant files, for example the top ten or top twenty video files, are found and returned to the user.
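The frame-to-frame differences and the ranking described above can be sketched as follows. This is one possible reading of the comparison, not the patent's definitive algorithm: each video is assumed to be a list of frames, each frame a flat list of (dx, dy) vectors in macroblock order, and both videos are assumed to have already been brought to the same map resolution and frame count. The names `frame_deltas`, `rms_distance` and `rank` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

MV = Tuple[float, float]
Frame = Sequence[MV]     # one 2-D MV map, flattened in macroblock order
Video = Sequence[Frame]  # frames along the time axis (a 3-D MV map)


def frame_deltas(video: Video) -> List[List[MV]]:
    """MV difference between frame n and frame n-1, per macroblock."""
    return [
        [(cx - px, cy - py) for (cx, cy), (px, py) in zip(cur, prev)]
        for prev, cur in zip(video, video[1:])
    ]


def rms_distance(a: Video, b: Video) -> float:
    """Root mean square of the gap between the two videos' frame deltas."""
    squares = [
        (ax - bx) ** 2 + (ay - by) ** 2
        for fa, fb in zip(frame_deltas(a), frame_deltas(b))
        for (ax, ay), (bx, by) in zip(fa, fb)
    ]
    if not squares:
        raise ValueError("need at least two frames per video")
    return math.sqrt(sum(squares) / len(squares))


def rank(query: Video, candidates: Dict[str, Video], top_k: int = 10) -> List[Tuple[str, float]]:
    """Sort candidates by distance to the query, most related first."""
    scored = [(name, rms_distance(query, video)) for name, video in candidates.items()]
    return sorted(scored, key=lambda item: item[1])[:top_k]


query = [[(0, 0), (1, 1)], [(1, 0), (2, 1)], [(3, 0), (2, 2)]]
library = {
    "unrelated": [[(0, 0), (0, 0)], [(5, 5), (0, 0)], [(0, 0), (5, 5)]],
    "same_film": [list(frame) for frame in query],
}
for name, distance in rank(query, library):
    print(name, round(distance, 3))
```

A smaller distance means a closer match, so the list is sorted in ascending order before the top entries are returned.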
The video search method proposed in this embodiment may be implemented in software. In another embodiment it may be implemented in a design that pairs some hardware with software operation, which can likewise carry out the method of searching for video on a network. Part of the computation may also be placed on a personal host or a local area network while the rest is placed on a remote system, for example a search engine host or a cloud system.
The video search method proposed in this embodiment may also proceed in two stages, in the manner of FIG. 1A to FIG. 1C: a coarse selection first finds a number of related video files and returns them to the user, and a fine selection (with stricter comparison conditions) then finds the more closely related videos. This is also one of the applications of this embodiment.

For a hardware implementation, one of the embodiments is described with reference to the circuit block diagram of FIG. 5A, which is not limiting. The circuit includes a stream parser 530, a 3-D motion vector (3D-MV) map generator 540 and a 3-D motion vector (3D-MV) map comparator 550.

The stream parser 530 parses the bitstreams of a number of already compressed video files and extracts their motion vector (MV) data. For the different resolutions under different search conditions, that is, motion vectors of different resolutions, the stream parser 530 may use up-sampling or down-sampling. In addition, to unify the resolution of the search index, the stream parser 530 may make the number of motion vectors per macroblock (MB) consistent. For example, to reduce n motion vectors to one, the values of the n motion vectors may be averaged; to convert a single motion vector into n motion vectors, the single vector is turned into n motion vectors with the same value.

The stream parser 530 may also choose the number of motion vectors per macroblock statistically. For example, with the group of pictures (GOP) as the basic unit, the ratio occupied by each block size to which the macroblock motion vectors belong is compared with a threshold to decide which number of motion vectors is used to build the search index.
The 3D-MV map generator 540 constructs a 3D motion vector map (3D-MV map) with time-axis data from the motion vectors. It stores the obtained motion vectors (MV) in a matrix to obtain a 2-D motion vector map (2D-MV map). Different 2-D motion vector maps are produced for the consecutively displayed frames within the time interval of the selected video file, and adding the time-axis parameter yields the data referred to as the 3-D motion vector map.

Because a time interval contains many pictures, the 3D-MV map generator 540 may take the motion vector (MV) values of the corresponding frames at a fixed ratio (for example 1:2, 1:4 or 1:N, where N is an integer), so that the resulting 3-D motion vector map reaches the configured level of accuracy. In one embodiment two stages may also be used: a coarse selection followed by a fine selection to find the most relevant videos.

The 3D-MV map comparator 550 compares the 3-D motion vector maps to find the related video files and displays the search results according to the degree of relevance. The comparator 550 is connected to a network 552 or a database 554 to read 3D-MV maps. For two video files being compared, it takes the motion vector difference between the Nth frame and the (N-1)th frame of one file and the motion vector difference between the Nth frame and the (N-1)th frame of the other, computes, for example, a root mean square (RMS) or an absolute difference to obtain a distance, derives the degree of correlation from the distribution of these distances, and shows the result on a display 560.

The circuit of this embodiment may further include a video format converter, which converts the video file 510 used as the search condition and the video files to be searched into video files of the same format. Where applicable, the video file 510 used as the search condition may instead be passed directly to the stream parser 530.

In one embodiment, the segment used as the search condition may be selected through a host that has a function control device, where the function control device is used to select a segment of the video data being played. In one embodiment the host may be a television and the control device a remote control that controls the television wirelessly. In another embodiment the host is a computer and the controller is a wireless or wired mouse.

For example, referring to FIG. 5B, the selection function may be built into the remote control that controls playback. While the film plays on the television 570, the user marks a start at a first time and an end at a second time T2, and the segment between the two times is used as the search information. In another embodiment, a user input interface such as a mouse or a touch-screen user interface lets the user mark a period of the film as the search information in a convenient and simple way.

FIG. 6A illustrates the motion vectors contained in one frame. FIG. 6B shows a series of consecutive frames along the time axis in a video file, together with their motion vectors. The frames 620, 622, 624, 626 and 628 may form the group of pictures (GOP) mentioned in this embodiment.
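The temporal sampling performed when the 3-D motion vector map is built can be sketched as below, using the figures given in the description (30 pictures per second, 1800 per minute). The function name `build_3d_mv_map` and the nested-list layout are assumptions for illustration; the coarse and fine ratios of 1:8 and 1:2 are likewise only example values.

```python
from typing import List, Sequence, Tuple

MV = Tuple[float, float]
MVMap2D = List[List[MV]]  # rows x columns of macroblock vectors for one frame


def build_3d_mv_map(frames: Sequence[MVMap2D], ratio: int = 1) -> List[MVMap2D]:
    """Stack 2-D MV maps along time, keeping one frame out of every `ratio`."""
    if ratio < 1:
        raise ValueError("ratio must be a positive integer")
    return [frames[i] for i in range(0, len(frames), ratio)]


# One minute of video at 30 pictures per second, with a dummy 1x1 map per frame.
one_minute = [[[(0.0, 0.0)]] for _ in range(30 * 60)]

coarse = build_3d_mv_map(one_minute, ratio=8)  # cheap map for a coarse selection
fine = build_3d_mv_map(one_minute, ratio=2)    # denser map for a fine selection
print(len(one_minute), len(coarse), len(fine))  # 1800 225 900
```

A larger ratio shrinks the map and the comparison cost, at the price of temporal precision, which is why a coarse pass followed by a fine pass over the surviving candidates is a natural fit.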
FIG. 6C shows the different block sizes to which the motion vectors of a macroblock (MB) may belong. For example, 630 shows a block size of 16x16 with one motion vector. 631 shows a block size with two motion vectors, and 632 shows the same kind of block size in the other orientation. 633 shows a block size of 8x8 with four motion vectors. 634, 635 and 636 show further, smaller block sizes, with 635 and 636 being the two orientations of one size, in which each block has one motion vector. 637 shows a block size of 4x4 in which each block has one motion vector.
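The relation between block size and the number of motion vectors in a macroblock can be worked out with a short calculation. The sketch assumes a 16x16 macroblock that is tiled uniformly by one block size, with one motion vector per block; the function name `mvs_per_macroblock` is hypothetical.

```python
def mvs_per_macroblock(block_w: int, block_h: int, mb_size: int = 16) -> int:
    """Number of blocks (hence motion vectors) tiling one square macroblock."""
    if block_w <= 0 or block_h <= 0 or mb_size % block_w or mb_size % block_h:
        raise ValueError("block size must divide the macroblock evenly")
    return (mb_size // block_w) * (mb_size // block_h)


for w, h in [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]:
    print(f"{w}x{h}: {mvs_per_macroblock(w, h)} motion vector(s)")
```

Under these assumptions the count ranges from 1 (a single 16x16 block) to 16 (sixteen 4x4 blocks), which matches the range of one to 16 motion vectors per macroblock stated earlier in the description.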
FIG. 7A and FIG. 7B show segments of the same film at different resolutions: a high definition (HD) segment 710 and a segment 730 in the lower-resolution Common Image Format (CIF), both covering the same 4 minutes 18 seconds. Following the plot of the film (that is, the change along the time axis), the segments contain the frames 720, 722, 724, 726 and 728 and the frames 740, 742, 744, 746 and 748 respectively. They show that for different video files of the same film the name, format, picture size and quality may differ, but the plot is the same, so the way the pictures change along the time axis differs little.

In a concrete verification example, referring to FIG. 8A, film A in QCIF format (810), film A in CIF format (820) and an unrelated film B in CIF format (830) were taken. The Common Image Format (CIF) is typically 352x288 pixels, and the Quarter Common Image Format (QCIF) is typically 176x144 pixels. Applying the method of this embodiment for searching video on a network gives the distances shown in FIG. 8B, where 812 is the distance for film A (QCIF), 822 the distance for film A (CIF) and 832 the distance for film B (CIF). The distance distribution of film A (QCIF) closely resembles that of film A (CIF), whereas both differ greatly from that of film B (CIF), so the degree of correlation can be determined.

FIG. 8C, 8D and 8E show the 3D-MV map distributions obtained for film A (QCIF format), film A (CIF format) and film B (CIF format) respectively, with one frame taken out of every 8. They show even more clearly that the 3D-MV map distribution of film A (QCIF) closely resembles that of film A (CIF), whereas both differ greatly from that of film B (CIF).

These results show that, with the video search method proposed in this embodiment, the distribution of the computed distances gives a value for the degree of correlation with the search target; sorting by that value produces the result, which is then returned to the user.
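Comparing a CIF map with a QCIF map, as in the experiment above, requires both maps to share one grid. The sketch below shows one way to down-sample a CIF motion vector map to the QCIF grid. It assumes 16x16 macroblocks with one vector each, and it halves the averaged vectors so that they are expressed in QCIF pixel units; that scaling step is an assumption of this sketch and is not stated in the description. The names `macroblock_grid` and `downsample_2x` are hypothetical.

```python
from typing import List, Tuple

MV = Tuple[float, float]
MVMap = List[List[MV]]  # rows x columns, one motion vector per macroblock


def macroblock_grid(width: int, height: int, mb: int = 16) -> Tuple[int, int]:
    """Macroblock columns and rows for a picture of the given pixel size."""
    return (width // mb, height // mb)


def downsample_2x(mv_map: MVMap) -> MVMap:
    """Average each 2x2 group of vectors and halve it (assumed pixel rescale)."""
    rows, cols = len(mv_map), len(mv_map[0])
    if rows % 2 or cols % 2:
        raise ValueError("map dimensions must be even")
    out: MVMap = []
    for r in range(0, rows, 2):
        row: List[MV] = []
        for c in range(0, cols, 2):
            cell = [mv_map[r][c], mv_map[r][c + 1], mv_map[r + 1][c], mv_map[r + 1][c + 1]]
            dx = sum(v[0] for v in cell) / 4
            dy = sum(v[1] for v in cell) / 4
            row.append((dx / 2, dy / 2))
        out.append(row)
    return out


print(macroblock_grid(352, 288))  # CIF  -> (22, 18)
print(macroblock_grid(176, 144))  # QCIF -> (11, 9)

cif_map = [[(4.0, 2.0)] * 22 for _ in range(18)]
qcif_map = downsample_2x(cif_map)
print(len(qcif_map[0]), len(qcif_map), qcif_map[0][0])  # 11 9 (2.0, 1.0)
```

After this step the two maps have the same 11x9 layout and can be compared macroblock by macroblock.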
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit it. Anyone with ordinary knowledge in the technical field may make minor changes and refinements without departing from the spirit and scope of the invention, so the scope of protection of the invention is defined by the appended claims.

[Brief Description of the Drawings]

FIG. 1A to FIG. 1C are schematic diagrams of a conventional search method that finds videos by name.

FIG. 2 is a schematic diagram of the relation between audiovisual content (AV Content) and its content description in the MPEG-7 standard.

FIG. 3 is a flowchart of a method for searching video on a network according to an embodiment of the invention.

FIG. 4 is a flowchart of a method for searching video on a network according to another embodiment of the invention.

FIG. 5A is a block diagram of a circuit implementation according to an embodiment of the invention.

FIG. 5B is a schematic diagram of how a film segment is selected for the search in an embodiment of the invention.

FIG. 6A is a schematic diagram of the motion vectors contained in one frame.

FIG. 6B is a schematic diagram of a series of consecutive frames along the time axis in a video file, together with their motion vectors.

FIG. 6C is a schematic diagram of the different block sizes to which the motion vectors of a macroblock (MB) belong.

FIG. 7A and FIG. 7B are schematic diagrams of segments of the same film at different resolutions and of the consecutive frames displayed.

FIG. 8A is a schematic diagram of the different films used: film A (QCIF format), film A (CIF format) and film B (CIF format).

FIG. 8B is a schematic diagram of the distances obtained for the three films of FIG. 8A with the method of this embodiment for searching video on a network.

FIG. 8C, 8D and 8E are schematic diagrams of the 3D-MV map distributions obtained for film A (QCIF format), film A (CIF format) and film B (CIF format) of FIG. 8A respectively.
[Description of Main Reference Numerals]

530: stream parser
540: 3-D motion vector (3D-MV) map generator
550: 3-D motion vector (3D-MV) map comparator
552: network
554: database
560: display
610, 620, 622, 624, 626 and 628: frames
710, 730: film segments
720, 722, 724, 726, 728: frames
740, 742, 744, 746, 748: frames
810: film A (QCIF format)
820: film A (CIF format)
830: film B (CIF format)
812: distance distribution of film A (QCIF)
822: distance distribution of film A (CIF)
832: distance distribution of film B (CIF)
Claims (1)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/804,477 US8805862B2 (en) | 2009-08-18 | 2010-07-21 | Video search method using motion vectors and apparatus thereof |
US13/077,984 US8515933B2 (en) | 2009-08-18 | 2011-04-01 | Video search method, video search system, and method thereof for establishing video database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US23463609P | 2009-08-18 | 2009-08-18 |
Publications (2)
Publication Number | Publication Date |
---|---|
TW201108005A true TW201108005A (en) | 2011-03-01 |
TWI443534B TWI443534B (en) | 2014-07-01 |
Family
ID=43786388
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW99113963A TWI443534B (en) | 2009-08-18 | 2010-04-30 | Video search method and apparatus using motion vectors |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN101996229B (en) |
TW (1) | TWI443534B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9609277B1 (en) | 2015-12-28 | 2017-03-28 | Nanning Fugui Precision Industrial Co., Ltd. | Playback system of video conference record and method for video conferencing record |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109344321B (en) * | 2012-05-08 | 2021-11-02 | 潍坊久宝智能科技有限公司 | System for obtaining user personalized features |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5796434A (en) * | 1996-06-07 | 1998-08-18 | Lsi Logic Corporation | System and method for performing motion estimation in the DCT domain with improved efficiency |
WO2002008948A2 (en) * | 2000-07-24 | 2002-01-31 | Vivcom, Inc. | System and method for indexing, searching, identifying, and editing portions of electronic multimedia files |
CN100426329C (en) * | 2001-10-12 | 2008-10-15 | 力新国际科技股份有限公司 | System and method for generating thumbnail sequence |
JP4513819B2 (en) * | 2007-03-19 | 2010-07-28 | 株式会社日立製作所 | Video conversion device, video display device, and video conversion method |
-
2010
- 2010-04-30 TW TW99113963A patent/TWI443534B/en active
- 2010-06-29 CN CN 201010220461 patent/CN101996229B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN101996229A (en) | 2011-03-30 |
CN101996229B (en) | 2013-11-06 |
TWI443534B (en) | 2014-07-01 |