JP7126539B2

JP7126539B2 - Video playback method, terminal and system

Info

Publication number: JP7126539B2
Application number: JP2020210327A
Authority: JP
Inventors: ▲榮▼▲フイ▼ ▲賀▼
Original assignee: インターデイジタルマディソンパテントホールディングスソシエテパーアクシオンサンプリフィエ
Priority date: 2020-12-18
Filing date: 2020-12-18
Publication date: 2022-08-26
Anticipated expiration: 2032-12-25
Also published as: JP2021072630A

Description

The present invention relates to the field of video surveillance, and in particular to a video playback method, terminal, and system in the field of video surveillance.

High-definition video has become an important technological trend in the field of video surveillance, and cameras with 720p or 1080p resolution are increasingly widely used. As camera resolution continues to increase, the range that a single camera can monitor has become wider and image details have become clearer. At the same time, intelligent analysis techniques for video images are gradually being put into practice. With the development of hardware devices, hardware performance can now meet the requirements of performing intelligent analysis on multiple regions of interest within the same image, which saves significant cost compared with manual monitoring.

Current video surveillance clients generally play video images from multiple cameras at the same time. However, as video image resolution increases, the total resolution of the video images from multiple cameras often exceeds the resolution of the client's monitor. Taking a 22-inch display as an example, such a display generally supports a maximum resolution of 1920×1080. That is, the display can play a 1080p image from only one feed. If multiple feeds of 1080p images are played on this monitor at the same time, the images must be zoomed out. In addition to the playback window, a typical video surveillance client interface also contains multiple auxiliary function panels, such as a title bar, a camera list, and a pan-tilt-zoom control panel, which further reduce the display area of the video images. Therefore, the image that can be played in the playback window is much smaller than the original image.

In particular, when an event (for example, an event triggered by intelligent analysis) occurs in a video image, the area of the image in which the event occurs is even smaller because the image is downsized during playback, which is inconvenient for the user to view. When an observer monitors the image with the naked eye, it is difficult for the observer to notice changes in detail, and as a result important information is missed.

At present, most clients provide a function for zooming in on a selected area of an image. That is, an area of the played video image is selected by dragging the mouse, and the selected area is zoomed in. This improves the image quality of the region of interest to some extent. However, digital zooming of a video image causes the loss of some pixel information, which affects image quality and, in turn, how well the user can observe image details. In addition, when the function of zooming in on a selected area is used, on the one hand, manual operation by the user is required, and when an event occurs suddenly, the user has no time to perform any operation and therefore misses the event. On the other hand, if events occur in different areas of the image, it is not possible to zoom in on several areas at the same time. Therefore, the user experience is relatively poor.

Embodiments of the present invention provide a video playback method, terminal, and system, which can improve user experience.

In a first aspect, an embodiment of the present invention provides a video playback method. The method includes: dividing an original playback image into at least two regions of interest; determining, among the at least two regions of interest, a first region of interest in which a trigger event occurs; obtaining decoded data of a first video image displayed in the first region of interest; and rendering the decoded data of the first video image to a designated playback window for playback.

In a first possible implementation of the first aspect, the method further includes determining a correspondence between each of the at least two regions of interest and a designated playback window. Rendering the decoded data of the first video image to the designated playback window for playback includes: rendering, according to the correspondence, the decoded data of the first video image to the designated playback window corresponding to the first region of interest for playback.

With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, determining, among the at least two regions of interest, the first region of interest in which the trigger event occurs includes: determining a trigger action performed by a user on a region of interest in the original playback image, where the trigger action includes a click action, a double-click action, or an action of selecting a region of interest; and determining the region of interest on which the trigger action is performed as the first region of interest.

With reference to the first aspect or the first possible implementation of the first aspect, in a third possible implementation of the first aspect, determining, among the at least two regions of interest, the first region of interest in which the trigger event occurs includes: obtaining coordinate metadata of the point at which the trigger event occurs in the original playback image; and determining, according to the coordinate metadata, the region of interest to which that point belongs as the first region of interest.

With reference to the first aspect or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, obtaining the decoded data of the first video image displayed in the first region of interest includes: obtaining decoded data of the original playback image; and determining the decoded data of the first video image according to the decoded data of the original playback image.

With reference to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, rendering the decoded data of the first video image to the designated playback window for playback includes: rendering the decoded data of the first video image to the designated playback window for playback in a zoomed-in manner, where the designated playback window is larger than the first region of interest.
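As a rough illustration of zoomed-in rendering, the sketch below computes how an image cropped from a region of interest could be scaled to fit a larger playback window. The function name `fit_into_window`, the preserved aspect ratio, and the centered placement are assumptions made for this example; the source only states that the designated playback window is larger than the first region of interest.

```python
from typing import Tuple


def fit_into_window(region_w: int, region_h: int,
                    window_w: int, window_h: int) -> Tuple[float, Tuple[int, int, int, int]]:
    """Return (scale, (x, y, w, h)) placing the region image inside the window.

    Assumption: the aspect ratio is preserved and the image is centered.
    """
    if min(region_w, region_h, window_w, window_h) <= 0:
        raise ValueError("all sizes must be positive")
    # The limiting dimension decides the scale factor.
    scale = min(window_w / region_w, window_h / region_h)
    out_w, out_h = round(region_w * scale), round(region_h * scale)
    return scale, ((window_w - out_w) // 2, (window_h - out_h) // 2, out_w, out_h)


# A 480x270 region of interest shown in a 1280x720 playback window.
scale, placement = fit_into_window(480, 270, 1280, 720)
```

A scale factor greater than 1 corresponds to the zoomed-in case described above.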

With reference to the first aspect or any one of the first to fifth possible implementations of the first aspect, in a sixth possible implementation of the first aspect, rendering the decoded data of the first video image to the designated playback window for playback includes: popping up an independent playback window; and rendering the decoded data of the first video image to the independent playback window for playback.

In a second aspect, an embodiment of the present invention provides a video playback terminal. The terminal includes: a division module configured to divide an original playback image into at least two regions of interest; a first determining module configured to determine, among the at least two regions of interest produced by the division module, a first region of interest in which a trigger event occurs; an acquisition module configured to obtain decoded data of a first video image displayed in the first region of interest determined by the first determining module; and a playback module configured to render the decoded data of the first video image obtained by the acquisition module to a designated playback window for playback.

In a first possible implementation of the second aspect, the terminal further includes a second determining module configured to determine a correspondence between each of the at least two regions of interest and a designated playback window. The playback module is further configured to render, according to the correspondence determined by the second determining module, the decoded data of the first video image obtained by the acquisition module to the designated playback window corresponding to the first region of interest for playback.

With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the first determining module includes: a first determining unit configured to determine a trigger action performed by a user on a region of interest in the original playback image, where the trigger action includes a click action, a double-click action, or an action of selecting a region of interest; and a second determining unit configured to determine, as the first region of interest, the region of interest on which the trigger action determined by the first determining unit is performed.

With reference to the second aspect or the first possible implementation of the second aspect, in a third possible implementation of the second aspect, the first determining module includes: a first acquisition unit configured to obtain coordinate metadata of the point at which the trigger event occurs in the original playback image; and a third determining unit configured to determine, according to the coordinate metadata obtained by the first acquisition unit, the region of interest to which that point belongs as the first region of interest.

With reference to the second aspect or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the acquisition module includes: a second acquisition unit configured to obtain decoded data of the original playback image; and a third determining unit configured to determine the decoded data of the first video image according to the decoded data of the original playback image obtained by the second acquisition unit.

With reference to the second aspect or any one of the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect, the playback module is further configured to render the decoded data of the first video image to the designated playback window for playback in a zoomed-in manner, where the designated playback window is larger than the first region of interest.

With reference to the second aspect or any one of the first to fifth possible implementations of the second aspect, in a sixth possible implementation of the second aspect, the playback module includes: a display unit configured to display an independent playback window; and a playback unit configured to render the decoded data of the first video image to the independent playback window popped up by the display unit for playback.

In a third aspect, an embodiment of the present invention provides a video playback system. The system includes: a terminal according to the second aspect of the present invention; a video capture system configured to capture a video image and generate a media stream by encoding the video image; a server configured to obtain the media stream generated by the video capture system and provide the media stream to the terminal; and a storage device configured to store the media stream obtained by the server. The terminal includes: a division module configured to divide an original playback image into at least two regions of interest; a first determining module configured to determine, among the at least two regions of interest produced by the division module, a first region of interest in which a trigger event occurs; an acquisition module configured to obtain decoded data of a first video image displayed in the first region of interest determined by the first determining module; and a playback module configured to render the decoded data of the first video image obtained by the acquisition module to a designated playback window for playback.

Based on the foregoing technical solutions, with the video playback method, terminal, and system according to the embodiments of the present invention, the original playback image is divided into multiple regions of interest, and the image in a region of interest in which a trigger event occurs is displayed separately. Therefore, on the one hand, the user can observe clearer image details in the region of interest, and on the other hand, the user can track image details in multiple regions of interest at the same time, which greatly improves user experience.

To describe the technical solutions in the embodiments of the present invention more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person skilled in the art can derive other drawings from these accompanying drawings without creative effort.

A video playback method, terminal, and system in the field of video surveillance are provided.

FIG. 1 is a schematic structural diagram of an exemplary application scenario according to an embodiment of the present invention.
FIG. 2 is a schematic flowchart of a video playback method according to an embodiment of the present invention.
FIG. 3 is another schematic flowchart of a video playback method according to an embodiment of the present invention.
FIG. 4 is a schematic flowchart of a method for dividing an original playback image into regions of interest according to an embodiment of the present invention.
FIG. 5 is another schematic flowchart of a method for dividing an original playback image into regions of interest according to an embodiment of the present invention.
FIG. 6 is a schematic flowchart of a method for determining a region of interest in which a trigger event occurs according to an embodiment of the present invention.
FIG. 7 is another schematic flowchart of a method for determining a region of interest in which a trigger event occurs according to an embodiment of the present invention.
FIG. 8 is a schematic flowchart of a method for obtaining decoded data of a region of interest according to an embodiment of the present invention.
FIG. 9 is a schematic flowchart of a method for playing an image in a region of interest according to an embodiment of the present invention.
FIG. 10 is a schematic flowchart of a video playback method according to another embodiment of the present invention.
FIG. 11A is another schematic flowchart of a video playback method according to another embodiment of the present invention.
FIG. 11B is another schematic flowchart of a video playback method according to another embodiment of the present invention.
FIG. 12A is a schematic diagram of playback of a region of interest according to an embodiment of the present invention.
FIG. 12B is a schematic diagram of playback of a region of interest according to an embodiment of the present invention.
FIG. 13 is a schematic block diagram of a terminal according to an embodiment of the present invention.
FIG. 14 is another schematic block diagram of a terminal according to an embodiment of the present invention.
FIG. 15 is a schematic block diagram of a first determining module according to an embodiment of the present invention.
FIG. 16 is another schematic block diagram of a first determining module according to an embodiment of the present invention.
FIG. 17 is a schematic block diagram of an acquisition module according to an embodiment of the present invention.
FIG. 18 is a schematic block diagram of a playback module according to an embodiment of the present invention.
FIG. 19 is a schematic block diagram of a system according to an embodiment of the present invention.
FIG. 20 is a schematic block diagram of a terminal according to another embodiment of the present invention.

The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

FIG. 1 is a schematic structural diagram of an exemplary application scenario according to an embodiment of the present invention. As shown in FIG. 1, a video surveillance system to which an embodiment of the present invention is applicable may include a video capture device, a central server, a storage device, and a terminal having a client. The video capture device may be used to capture video images, generate a media stream by encoding the video images, and transmit the video images over a network. For example, the video capture device may include devices such as a network camera, an analog camera, an encoder, and a digital video recorder (DVR). After connecting to the central server, the client on the terminal can request a video stream, decode and display the video stream, and present on-site video images to the user.

The central server may include a management server and a media server. The media server may be responsible for receiving the media stream, recording and saving the media stream data in the storage device, and forwarding the media stream to the client for on-demand playback. The management server may be responsible for functions such as user login, authentication, and service scheduling. The central server may also be accessed by multiple clients and may, for example, manage network connections between different video surveillance systems. The storage device may be, for example, a disk array. The disk array may be responsible for storing video data, and network attached storage (NAS), a storage area network (SAN), or the server itself may be used to store the video data.

It should be understood that the video surveillance system shown in FIG. 1 is merely one embodiment to which the method of the present invention is applicable and is not intended to limit the purpose or function of the present invention. The present invention should not be interpreted as depending on any requirement relating to any one or any combination of the components of the video surveillance system shown in FIG. 1. However, to describe the present invention more clearly, the embodiments of the present invention are described below using the application scenario of a video surveillance system as an example. The present invention is not limited thereto.

It should be understood that the technical solutions for video data transmission in the embodiments of the present invention may use various communication networks or communication systems, for example, a Global System for Mobile Communications ("GSM", registered trademark) system, a Code Division Multiple Access ("CDMA") system, a Wideband Code Division Multiple Access ("WCDMA", registered trademark) system, a general packet radio service ("GPRS") system, a Long Term Evolution ("LTE") system, an LTE frequency division duplex ("FDD") system, an LTE time division duplex ("TDD") system, a Universal Mobile Telecommunication System ("UMTS"), or a Worldwide Interoperability for Microwave Access ("WiMAX") communication system. The embodiments of the present invention are not limited thereto.

FIG. 2 is a schematic flowchart of a video playback method 100 according to an embodiment of the present invention. The method 100 may be performed by a video playback apparatus, for example, a terminal or a client. As shown in FIG. 2, the method 100 includes the following steps.
S110: Divide the original playback image into at least two regions of interest.
S120: Determine, among the at least two regions of interest, a first region of interest in which a trigger event occurs.
S130: Obtain decoded data of a first video image displayed in the first region of interest.
S140: Render the decoded data of the first video image to a designated playback window for playback.
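Steps S120 to S140 can be sketched in code. The following is a minimal illustration under stated assumptions, not the implementation described by the source: a decoded frame is assumed to be a list of pixel rows, regions of interest are assumed to be axis-aligned rectangles, the trigger event is assumed to arrive as an (x, y) point, and a dictionary stands in for real display windows. The names `Region`, `find_triggered_region`, `crop`, and `play_region` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence, Tuple

Frame = List[List[int]]  # assumed representation: rows of pixel values


@dataclass(frozen=True)
class Region:
    """Axis-aligned region of interest, in pixel coordinates of the original image."""
    x: int
    y: int
    width: int
    height: int

    def contains(self, px: int, py: int) -> bool:
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def find_triggered_region(regions: Sequence[Region],
                          event_point: Tuple[int, int]) -> Optional[int]:
    """S120: return the index of the region containing the event point, if any."""
    px, py = event_point
    for index, region in enumerate(regions):
        if region.contains(px, py):
            return index
    return None


def crop(frame: Frame, region: Region) -> Frame:
    """S130: take the region's pixels out of the decoded original frame."""
    return [row[region.x:region.x + region.width]
            for row in frame[region.y:region.y + region.height]]


def play_region(frame: Frame, regions: Sequence[Region],
                event_point: Tuple[int, int],
                windows: Dict[str, Frame], window_name: str) -> Optional[int]:
    """Run S120 to S140 for one frame; `windows` stands in for display surfaces."""
    index = find_triggered_region(regions, event_point)
    if index is None:
        return None
    windows[window_name] = crop(frame, regions[index])  # S140: "render"
    return index


# S110 is done by hand here: a 4x2 frame split into a left and a right region.
frame = [[0, 1, 2, 3],
         [4, 5, 6, 7]]
regions = [Region(0, 0, 2, 2), Region(2, 0, 2, 2)]
windows: Dict[str, Frame] = {}
triggered = play_region(frame, regions, (3, 1), windows, "detail")
```

Because the crop is taken from the decoded data of the original image rather than from the downsized on-screen picture, the separate window receives the region at its original pixel resolution.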

To avoid affecting the quality of the played video and to improve how well the user can observe image details, particularly when multiple video images are displayed zoomed out in the same window, the video playback apparatus may first divide the original playback image into multiple regions of interest and then obtain decoded data of the video image displayed in the region of interest in which a trigger event occurs. The video playback apparatus can then render the decoded data of that video image to an independent designated playback window for playback. In this way, the image details that the user is interested in can be displayed in an independent window, and the user's observation of image details is improved.

Therefore, the video playback method according to the embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image in a region of interest in which a trigger event occurs. On the one hand, the user can observe clearer image details in the region of interest, and on the other hand, the user can track image details in multiple regions of interest at the same time, which greatly improves user experience.

It should be understood that, in the embodiments of the present invention, video includes not only video files but also real-time video streams. The embodiments of the present invention are described using playback of a real-time video stream as an example, but the embodiments of the present invention are not limited thereto.

In an embodiment of the present invention, optionally, as shown in FIG. 3, the method 100 further includes the following step.
S150: Determine a correspondence between each of the at least two regions of interest and a designated playback window.

Rendering the decoded data of the first video image to the designated playback window for playback includes the following step.
S141: Render, according to the correspondence, the decoded data of the first video image to the designated playback window corresponding to the first region of interest for playback.

That is, each region of interest may be associated with one or more playback windows so that the image in the region of interest is played when a trigger event occurs in that region. The designated window may be the largest playback window on the display device, or may be part of the largest playback window. The designated window may be a currently existing playback window or part of an existing playback window, or may be a new pop-up playback window or a newly created playback window. The embodiments of the present invention are not limited thereto.
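The correspondence determined in S150 can be held in a simple lookup table. The sketch below assumes that regions of interest are identified by integer indices and playback windows by string names; `WindowBindings`, `bind`, and `windows_for` are hypothetical names, and falling back to a default pop-up window for an unbound region is an assumption of this example rather than something the source specifies.

```python
from collections import defaultdict
from typing import DefaultDict, List


class WindowBindings:
    """Maps each region of interest to one or more designated playback windows."""

    def __init__(self, default_window: str = "popup") -> None:
        self._default = default_window  # assumed fallback, e.g. a new pop-up window
        self._bindings: DefaultDict[int, List[str]] = defaultdict(list)

    def bind(self, region_index: int, window_name: str) -> None:
        """S150: record that `window_name` plays `region_index` on a trigger event."""
        if window_name not in self._bindings[region_index]:
            self._bindings[region_index].append(window_name)

    def windows_for(self, region_index: int) -> List[str]:
        """S141: look up the window(s) to render the triggered region into."""
        bound = self._bindings.get(region_index)
        return list(bound) if bound else [self._default]


bindings = WindowBindings()
bindings.bind(0, "window-1")
bindings.bind(0, "window-2")   # one region may be associated with several windows
bindings.bind(3, "window-1")   # assumption: a window may also be shared by regions
```

Keeping the table separate from the rendering code means the same lookup works whether the designated window is an existing window, part of one, or a newly created one.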

With reference to FIG. 4 to FIG. 12B, the video playback method according to an embodiment of the present invention is described in detail below.

In S110, optionally, the step of dividing the original playback image into at least two regions of interest includes dividing the original playback image into at least two regions of interest by an equal division method or a free division method.

Specifically, a single playback window may be divided in advance into multiple regions of interest on the client. The regions of interest may be the same size or different sizes, and a region of interest may be set as an irregular region. In addition, in embodiments of the present invention, the correspondence between regions of interest and playback windows may be determined. The division of the original playback image into regions of interest may be performed manually by the user or configured automatically by the client software, and the configuration is saved on the client.

The image may be divided by an equal division method or a free division method. The specific configuration processes are shown in FIG. 4 and FIG. 5. For example, as shown in FIG. 4, dividing the image into regions of interest by the equal division method includes the following steps.
S111: Click the right-click menu or a toolbar button to display the configuration window.
S112: In the pop-up configuration window, set the number of regions of interest, for example, 16.
S113: Right-click a region of interest to set the playback window bound to that region.
S114: Select the playback window that plays the video in the region of interest when a trigger event occurs in that region.
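As an illustration of the equal division in S112, the sketch below splits an image into a grid of rectangular regions. The specification only gives the number of regions (for example, 16); the 4×4 grid layout, the `Roi` tuple, and the function name `divide_equally` are assumptions introduced here.

```python
from typing import List, NamedTuple


class Roi(NamedTuple):
    """Rectangular region; the end coordinates are exclusive."""
    start_x: int
    start_y: int
    end_x: int
    end_y: int


def divide_equally(width: int, height: int, rows: int, cols: int) -> List[Roi]:
    """Split a width x height image into rows * cols regions, row by row."""
    rois = []
    for r in range(rows):
        for c in range(cols):
            rois.append(Roi(
                start_x=c * width // cols,
                start_y=r * height // rows,
                end_x=(c + 1) * width // cols,
                end_y=(r + 1) * height // rows,
            ))
    return rois


# Assumed layout: 16 regions arranged as a 4 x 4 grid on a 1080p image.
rois = divide_equally(1920, 1080, rows=4, cols=4)
```

Computing each boundary as `index * size // count` keeps adjacent regions gap-free even when the image size is not an exact multiple of the grid.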

As shown in FIG. 5, dividing the image into regions of interest by the free division method may include, for example, the following steps.
S115: Click the right-click menu or a toolbar button to pop up the configuration window.
S116: In the pop-up configuration window, drag the mouse to draw the regions of interest. The regions of interest may have the same or different sizes and shapes.
S117: Right-click a region of interest to set the playback window bound to that region.
S118: Select the playback window that plays the video in the region of interest when a trigger event occurs in that region.

It should be understood that, in embodiments of the present invention, the original playback image that is divided into regions of interest may be the entire playback image in the largest playback window on the display device, or may be one or more of multiple images played simultaneously in the largest playback window. Embodiments of the present invention are not limited to these.

In S120, the video playback device determines, among the at least two regions of interest, the first region of interest in which a trigger event occurs, so that the image within the first region of interest can be displayed in a separate playback window, which improves the display of image details.

In embodiments of the present invention, the region of interest may be determined from an event that the user triggers manually, or from an automatically generated trigger event that is detected. These two cases are described below with reference to FIG. 6 and FIG. 7, respectively.

As shown in FIG. 6, optionally, the step of determining, among the at least two regions of interest, the first region of interest in which a trigger event occurs includes the following steps.
S121: Determine the trigger action performed by the user on a region of interest in the original playback image. The trigger action includes a click, a double-click, or an action of selecting a region of interest.
S122: Determine the region of interest on which the trigger action is performed as the first region of interest.

Specifically, when the user notices an event while watching the video, the user can operate the client interface, for example by performing a trigger action on a region of interest in the original playback image. The image within the region of interest where the event occurs is then played in a pre-designated playback window or displayed in an independent pop-up playback window. When events occur in multiple regions of interest, the user can trigger multiple windows for display. The trigger action is, for example, a click, a double-click, or an action of selecting a region of interest. Embodiments of the present invention are not limited to these.

FIG. 7 shows another schematic flowchart of a method for determining the region of interest in which a trigger event occurs, according to an embodiment of the present invention. As shown in FIG. 7, optionally, the step of determining, among the at least two regions of interest, the first region of interest in which a trigger event occurs includes the following steps.
S123: Obtain the coordinate metadata of the trigger event occurrence point in the original playback image.
S124: According to the coordinate metadata, determine the region of interest to which the trigger event occurrence point belongs as the first region of interest.

Specifically, for example, the user can preconfigure the areas that require automatic event detection and configure an event detection rule, such as motion detection or intelligent analysis detection. When an event occurs, the client software determines the corresponding preconfigured region of interest from the coordinate metadata of the trigger event occurrence point, so that the corresponding image is played in the pre-designated playback window or displayed in an independent pop-up playback window. When events occur in multiple regions of interest, the client software can open multiple windows for display.
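The lookup from coordinate metadata to a preconfigured region (S123 and S124) can be sketched as a point-in-rectangle test. This is only an illustration under stated assumptions: regions are taken to be axis-aligned rectangles with exclusive end coordinates, and the name `find_roi` is hypothetical. The specification also allows irregular regions, which this sketch does not cover.

```python
from typing import List, Optional, Tuple

# (start_x, start_y, end_x, end_y); end coordinates are exclusive (assumption).
Roi = Tuple[int, int, int, int]


def find_roi(rois: List[Roi], x: int, y: int) -> Optional[int]:
    """Return the index of the region containing point (x, y), or None."""
    for index, (start_x, start_y, end_x, end_y) in enumerate(rois):
        if start_x <= x < end_x and start_y <= y < end_y:
            return index
    return None
```

Returning `None` for a point outside every configured region lets the caller ignore events that fall in unmonitored parts of the image.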

It should be understood that, in an irregular intelligent analysis area, a trigger event may cover multiple regions of interest; in that case, all of these regions may be determined as the first region of interest in which the trigger event occurs. Embodiments of the present invention are not limited to this.
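When an event covers several regions at once, one possible way to collect them is an overlap test between the event area and each region. The sketch below approximates the event area by its bounding box, which is a simplifying assumption (the text speaks of irregular analysis areas); the function name `rois_covered` is likewise hypothetical.

```python
from typing import List, Tuple

# (start_x, start_y, end_x, end_y); end coordinates are exclusive (assumption).
Box = Tuple[int, int, int, int]


def rois_covered(rois: List[Box], event_box: Box) -> List[int]:
    """Return the indices of all regions that overlap the event bounding box."""
    ev_x0, ev_y0, ev_x1, ev_y1 = event_box
    covered = []
    for index, (start_x, start_y, end_x, end_y) in enumerate(rois):
        overlaps_x = start_x < ev_x1 and ev_x0 < end_x
        overlaps_y = start_y < ev_y1 and ev_y0 < end_y
        if overlaps_x and overlaps_y:
            covered.append(index)
    return covered
```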

In embodiments of the present invention, the video playback device can determine whether a trigger event occurs in a region of interest by means of motion detection, intelligent analysis detection, or the like. A central server can also perform the detection. When the central server detects a trigger event, it can feed the coordinate metadata of the trigger event occurrence point back to the video playback device, which then determines the first region of interest in which the trigger event occurs according to the coordinate metadata. Embodiments of the present invention are not limited to this.

In S130, the video playback device obtains the decoded data of the first video image displayed within the first region of interest, so that the first video image can be played in the designated playback window.
In an embodiment of the present invention, optionally, as shown in FIG. 8, the step of obtaining the decoded data of the first video image displayed in the first region of interest includes the following steps.
S131: Obtain the decoded data of the original playback image.
S132: Determine the decoded data of the first video image according to the decoded data of the original playback image.

Specifically, for example, the video playback device receives an event that the user triggers manually by a click, a double-click, a toolbar button click, or a shortcut key. The device can extract the data belonging to the region of interest from the decoded YUV data of the original playback window and, according to the preconfigured correspondence, play this portion in the pre-designated playback window (or pop up an independent playback window to play it). The multiple playback windows use the same YUV data source, so the device does not need to introduce or add extra video streams.

For example, assume that the resolution of the original playback image is Width×Height. For the region of interest, let StartX be the horizontal coordinate of the start point, StartY the vertical coordinate of the start point, EndX the horizontal coordinate of the end point, and EndY the vertical coordinate of the end point. The YUV data of the original playback image is in the array Org[Width×Height], the YUV data of the region of interest is in the array Dst[ROIWidth×ROIHeight], and n is the index of any point in the region of interest. The YUV data within the region of interest can then be determined according to the following formulas.
ROIWidth = EndX-StartX
ROIHeight = EndY-StartY
Dst[n] = Org[(Width×(StartY+n/ROIWidth)+StartX+n%ROIWidth)]
Here the division operator "/" denotes integer division, in which the fractional part is discarded (rounding down), and the symbol "%" denotes the remainder operation.
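The indexing formula above can be written out directly as code. The sketch below applies it to a single row-major plane stored as a flat list; the function name `crop_roi` is an assumption. For chroma-subsampled formats such as YUV 4:2:0, the U and V planes are smaller than the Y plane, so their coordinates would need to be scaled accordingly, which the formula as given does not address.

```python
from typing import List, Sequence


def crop_roi(org: Sequence[int], width: int,
             start_x: int, start_y: int, end_x: int, end_y: int) -> List[int]:
    """Copy the region of interest out of one row-major image plane."""
    roi_width = end_x - start_x
    roi_height = end_y - start_y
    dst = [0] * (roi_width * roi_height)
    for n in range(roi_width * roi_height):
        # n // roi_width is the row inside the region (integer division),
        # n % roi_width is the column inside the region (remainder).
        dst[n] = org[width * (start_y + n // roi_width) + start_x + n % roi_width]
    return dst
```

For a 4×4 plane holding the values 0 to 15, cropping the region from (1, 1) to (3, 3) selects the four centre samples 5, 6, 9, and 10.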

In S140, the video playback device renders the decoded data of the first video image into the designated playback window for playback.

Specifically, the video playback device can play the first video image in a pop-up window, display it in a new playback window, or display it in the original playback window, and can apply digital zoom to the first video image so that it fits the size of the playback window. That is, in embodiments of the present invention, the designated window may be the largest playback window on the display device or a part of the largest playback window. It may be an existing playback window or a part of one, or it may be a new pop-up or newly created playback window. There may be one designated window or several. Embodiments of the present invention are not limited to these.

In an embodiment of the present invention, optionally, the step of rendering the decoded data of the first video image into the designated playback window for playback includes rendering the decoded data of the first video image into the designated playback window in a zoomed-in manner for playback, where the designated playback window is larger than the first region of interest.
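Rendering a region into a larger window implies some form of upscaling. The specification does not name an interpolation method, so the following sketch uses nearest-neighbour scaling purely as an assumed example; the function name `scale_nearest` is hypothetical, and a real client would more likely rely on its rendering library or GPU for this step.

```python
from typing import List, Sequence


def scale_nearest(src: Sequence[int], src_w: int, src_h: int,
                  dst_w: int, dst_h: int) -> List[int]:
    """Resize one row-major plane to dst_w x dst_h by nearest-neighbour sampling."""
    dst = [0] * (dst_w * dst_h)
    for y in range(dst_h):
        src_y = y * src_h // dst_h  # source row that maps to output row y
        for x in range(dst_w):
            src_x = x * src_w // dst_w  # source column for output column x
            dst[y * dst_w + x] = src[src_y * src_w + src_x]
    return dst
```

The same function also covers the case where the window is smaller than or equal to the region, since the mapping works for any target size.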

In an embodiment of the present invention, for example as shown in FIG. 9, the step of rendering the decoded data of the first video image into the designated playback window for playback includes the following steps.
S142: Display an independent playback window.
S143: Render the decoded data of the first video image into the independent playback window for playback.

It should be understood that, in embodiments of the present invention, the designated playback window may be an independent pop-up playback window that is larger than the first region of interest, so that the first video image is played in a zoomed-in manner; however, embodiments of the present invention are not limited to this. For example, the independent playback window may be smaller than or equal in size to the first region of interest.

It should be understood that, in embodiments of the present invention, "B corresponding to A" means that B is associated with A and that B can be determined according to A. It should also be understood that determining B according to A does not mean that B is determined according to A alone; B may be determined according to A and/or other information.

It should be understood that, in embodiments of the present invention, the sequence numbers of the foregoing processes do not indicate an order of execution. The order in which the processes are executed should be determined by their functions and inherent logic, and the numbering should not limit the implementation of the embodiments in any way.

Therefore, the video playback method according to embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image of a region of interest in which a trigger event occurs. On the one hand, the user can observe clearer image details in the region of interest; on the other hand, the user can track image details in multiple regions of interest at the same time, which significantly improves the user experience. In addition, embodiments of the present invention play the image of the region of interest from the original decoded data, without adding extra video streams.

With reference to FIG. 10 to FIG. 12B, the video playback method according to embodiments of the present invention is described in detail below.

As shown in FIG. 10, a video playback method 200 may be performed by a video playback device, for example a terminal or a client. The method 200 may include the following steps.
S201: Display the graphical user interface (GUI) of the client.
S202: Determine whether to set the playback image division method. If so, the process proceeds to S203. Otherwise, the process proceeds to S204.
S203: Set the playback image division method. The process then proceeds to S204.
S204: Determine whether the user starts playback. If so, the process proceeds to S205. Otherwise, the process returns to S201.
S205: Enable the network port.
S206: Receive a media stream, decode it, and render the decoded media stream on the display device for display.
S207: Determine whether the user manually triggers an event. If so, the process proceeds to S208. Otherwise, the process proceeds to S209.
S208: When it is determined that the user manually triggers an event, display the event occurrence area in the designated window in a zoomed-in manner. The process then returns to S206.
S209: Determine whether the device automatically triggers an event. If so, the process proceeds to S210. Otherwise, the process proceeds to S211.
S210: When it is determined that the device automatically triggers an event, display the event occurrence area in the designated window in a zoomed-in manner. The process then returns to S206.
S211: Determine whether the user ends playback. If so, the process proceeds to S212. Otherwise, the process returns to S206.
S212: Determine whether the user closes the client. If so, the process proceeds to S213. Otherwise, the process returns to S201.
S213: Clean up system resources and end video playback.
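The control flow of method 200 can be summarised as a transition table. The sketch below encodes only the step-to-step transitions stated in the list above; the representation (a dictionary of yes/no targets) and the helper name `walk` are assumptions, and the actual work performed at each step is omitted.

```python
from typing import Dict, Iterable, List, Tuple

# step -> (next step on "yes", next step on "no").
# Unconditional steps list the same target twice.
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "S201": ("S202", "S202"),
    "S202": ("S203", "S204"),
    "S203": ("S204", "S204"),
    "S204": ("S205", "S201"),
    "S205": ("S206", "S206"),
    "S206": ("S207", "S207"),
    "S207": ("S208", "S209"),
    "S208": ("S206", "S206"),
    "S209": ("S210", "S211"),
    "S210": ("S206", "S206"),
    "S211": ("S212", "S206"),
    "S212": ("S213", "S201"),
}
DECISION_STEPS = {"S202", "S204", "S207", "S209", "S211", "S212"}


def walk(answers: Iterable[bool]) -> List[str]:
    """Follow the flow from S201, using one yes/no answer per decision step.

    Stops at S213 or when the answers run out.
    """
    remaining = iter(answers)
    step = "S201"
    path = [step]
    while step != "S213":
        on_yes, on_no = TRANSITIONS[step]
        if step in DECISION_STEPS:
            try:
                answer = next(remaining)
            except StopIteration:
                break
            step = on_yes if answer else on_no
        else:
            step = on_yes
        path.append(step)
    return path
```

For instance, answering "no" to setting the division method, "yes" to starting playback, triggering one manual event, and then ending playback and closing the client visits S208 once before reaching S213.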

FIG. 11A shows a schematic flowchart of a playback method 300 in which a region of interest is triggered manually, according to an embodiment of the present invention. The method 300 may be performed by a video playback device, for example a terminal or a client. As shown in FIG. 11A, the method 300 includes the following steps.
S301: Render and play each frame of the video image normally in the original playback window.
S302: Determine whether the user manually triggers an event. If so, the process proceeds to S303. Otherwise, the process returns to S301.
S303: Obtain the region of interest in which the user event is located.
S304: Check the playback window bound to the region of interest.
S305: For each frame of the video image, calculate the YUV data of the part of the video image covered by the region of interest.
S306: For each frame of the video image, render the YUV data of the region of interest into the designated playback window for playback. For example, as shown in FIG. 12A, the overall playback window contains the original playback image window and three designated playback windows, each the same size as the original playback image window, and the original playback image is divided into 16 regions of interest. The image of the region of interest in which the manual trigger event occurs is played in a zoomed-in manner in one of the designated playback windows.
S307: Determine whether the user ends playback. If so, the process proceeds to S308. Otherwise, the process returns to S305.
S308: Stop video playback and end the process.
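Steps S305 and S306 repeat for every decoded frame. A minimal per-frame loop might look like the sketch below, where each frame is one row-major plane in a flat list and `render` stands in for whatever draws into the designated playback window. Both the function name `play_triggered_roi` and the `render` callback signature are assumptions for illustration.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Roi = Tuple[int, int, int, int]  # (start_x, start_y, end_x, end_y), end exclusive


def play_triggered_roi(frames: Iterable[Sequence[int]], width: int, roi: Roi,
                       render: Callable[[List[int], int, int], None]) -> None:
    """For each decoded frame, crop the region (S305) and hand it to render (S306)."""
    start_x, start_y, end_x, end_y = roi
    roi_width = end_x - start_x
    roi_height = end_y - start_y
    for frame in frames:
        roi_data: List[int] = []
        for row in range(start_y, end_y):
            # Copy one row of the region straight from the shared decoded frame.
            roi_data.extend(frame[width * row + start_x: width * row + end_x])
        render(roi_data, roi_width, roi_height)
```

Because the loop reads from the already decoded frame, the extra window needs no second video stream.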

FIG. 11B is a schematic flowchart of a playback method 400 in which a region of interest is triggered automatically by an event, according to an embodiment of the present invention. The method 400 may include the following steps.
S401: Render and play each frame of the video image normally in the original playback window.
S402: Perform intelligent analysis to determine whether a trigger event occurs. If so, the process proceeds to S403. Otherwise, the process returns to S401.
S403: Calculate the correspondence between the intelligent analysis area and the regions of interest.
S404: Obtain the region of interest (or regions of interest) covered by the analysis event.
S405: Check the playback window bound to the region of interest.
S406: For each frame of the video image, calculate the YUV data of the part of the video image covered by the region of interest.
S407: For each frame of the video image, render the YUV data of the region of interest into the designated playback window for playback. For example, as shown in FIG. 12B, the overall playback window contains the original playback image window and three designated playback windows, each the same size as the original playback image window, and the original playback image is divided into 16 regions of interest. The image of the region of interest in which the trigger event occurs is played in a zoomed-in manner in one of the designated playback windows.
S408: Determine whether the user ends playback. If so, the process proceeds to S409. Otherwise, the process returns to S406.
S409: Stop video playback and end the process.

Therefore, the video playback method according to embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image of a region of interest in which a trigger event occurs. On the one hand, the user can observe clearer image details in the region of interest; on the other hand, the user can track image details in multiple regions of interest at the same time, which significantly improves the user experience.

The video playback method according to embodiments of the present invention has been described in detail above with reference to FIG. 1 to FIG. 12B. The video playback terminal and system according to embodiments of the present invention are described in detail below with reference to FIG. 13 to FIG. 20.

FIG. 13 shows a schematic block diagram of a terminal 500 according to an embodiment of the present invention. As shown in FIG. 13, the terminal 500 includes:
a division module 510, configured to divide the original playback image into at least two regions of interest;
a first determining module 520, configured to determine, among the at least two regions of interest produced by the division module 510, the first region of interest in which a trigger event occurs;
an acquisition module 530, configured to obtain the decoded data of the first video image displayed within the first region of interest determined by the first determining module 520; and
a playback module 540, configured to render the decoded data of the first video image obtained by the acquisition module 530 into the designated playback window for playback.
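The way modules 510 to 540 hand data to one another can be pictured as a simple pipeline. The sketch below is purely illustrative: the `Terminal` dataclass, the method `on_trigger`, and the stand-in functions `halves`, `containing`, and `crop` are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Roi = Tuple[int, int, int, int]  # (start_x, start_y, end_x, end_y), end exclusive


@dataclass
class Terminal:
    """Hypothetical wiring of modules 510 to 540; all names are illustrative."""
    divide: Callable[[int, int], List[Roi]]                   # division module 510
    determine: Callable[[List[Roi], int, int], Roi]           # first determining module 520
    acquire: Callable[[Sequence[int], int, Roi], List[int]]   # acquisition module 530
    play: Callable[[List[int], Roi], None]                    # playback module 540

    def on_trigger(self, frame: Sequence[int], width: int, height: int,
                   x: int, y: int) -> None:
        """Run one trigger event at point (x, y) through the four modules."""
        rois = self.divide(width, height)
        first_roi = self.determine(rois, x, y)
        decoded = self.acquire(frame, width, first_roi)
        self.play(decoded, first_roi)


def halves(width: int, height: int) -> List[Roi]:
    """Stand-in divider: split the image into a left and a right half."""
    return [(0, 0, width // 2, height), (width // 2, 0, width, height)]


def containing(rois: List[Roi], x: int, y: int) -> Roi:
    """Stand-in determiner: return the region that contains point (x, y)."""
    return next(r for r in rois if r[0] <= x < r[2] and r[1] <= y < r[3])


def crop(frame: Sequence[int], width: int, roi: Roi) -> List[int]:
    """Stand-in acquirer: copy the region's samples from a row-major frame."""
    sx, sy, ex, ey = roi
    return [frame[width * row + col] for row in range(sy, ey) for col in range(sx, ex)]
```

Passing the modules in as callables keeps each one replaceable, for example swapping the manual determiner for one driven by coordinate metadata.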

Therefore, the video playback terminal according to embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image in a region of interest in which a trigger event occurs. On the one hand, the user can observe clearer image details in the region of interest; on the other hand, the user can track image details in multiple regions of interest at the same time, which significantly improves the user experience.

It should be understood that, in embodiments of the present invention, the video playback terminal can play not only video files but also real-time video streams. The embodiments are described using an example in which the terminal plays a real-time video stream, but they are not limited to this.

In an embodiment of the present invention, optionally, as shown in FIG. 14, the terminal 500 further includes:
a second determining module 550, configured to determine the correspondence between each region of interest of the at least two regions of interest and the designated playback window.
The playback module 540 is further configured to render, according to the correspondence determined by the second determining module 550, the decoded data of the first video image obtained by the acquisition module 530 into the designated playback window that corresponds to the first region of interest, for playback.

In an embodiment of the present invention, optionally, the division module 510 is further configured to divide the original playback image into at least two regions of interest by an equal division method or a free division method.

In an embodiment of the present invention, optionally, as shown in FIG. 15, the first determining module 520 includes:
a first determining unit 521, configured to determine the trigger action performed by the user on a region of interest in the original playback image, where the trigger action includes a click, a double-click, or an action of selecting a region of interest; and
a second determining unit 522, configured to determine the region of interest on which the trigger action determined by the first determining unit 521 is performed as the first region of interest.

In an embodiment of the present invention, optionally, as shown in FIG. 16, the first determining module 520 includes:
a first obtaining unit 523, configured to obtain the coordinate metadata of the trigger event occurrence point in the original playback image; and
a third determining unit 524, configured to determine, according to the coordinate metadata obtained by the first obtaining unit 523, the region of interest to which the trigger event occurrence point belongs as the first region of interest.

Optionally, in an embodiment of the present invention, as shown in FIG. 17, the acquiring module 530 includes:
a second acquiring unit 531 configured to acquire decoded data of the original playback image; and
a third determining unit 532 configured to determine the decoded data of the first video image according to the decoded data of the original playback image acquired by the second acquiring unit 531.
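One plausible reading of determining the decoded data of the first video image from the decoded data of the original playback image is a crop of the decoded frame to the region's rectangle. The sketch below assumes a decoded frame represented as a list of pixel rows; `crop_decoded` is a hypothetical helper, and a real player would more likely operate on a native frame buffer.

```python
def crop_decoded(frame, x, y, width, height):
    """Return the sub-image of a decoded frame covered by one region.

    `frame` is assumed to be a non-empty list of rows, each row a list of
    pixel values. The result is a new list of rows; the input frame is
    left unchanged.
    """
    if not frame or not frame[0]:
        raise ValueError("frame must contain at least one pixel")
    if x < 0 or y < 0 or width <= 0 or height <= 0:
        raise ValueError("region must have a non-negative origin and positive size")
    if y + height > len(frame) or x + width > len(frame[0]):
        raise ValueError("region extends beyond the decoded frame")
    return [row[x:x + width] for row in frame[y:y + height]]
```

Because the crop reuses data that has already been decoded for the original playback image, no second decoding pass is needed in this sketch.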

Optionally, in an embodiment of the present invention, the playback module is further configured to render the decoded data of the first video image into the designated playback window for playback in a zoom-in manner, where the designated playback window is larger than the first region of interest.
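Rendering in a zoom-in manner into a window larger than the region implies scaling the cropped image up to the window size. The specification does not name a scaling algorithm; the sketch below uses nearest-neighbor sampling purely as an assumed stand-in, and `zoom_nearest` is a hypothetical name.

```python
def zoom_nearest(image, out_width, out_height):
    """Scale an image (list of pixel rows) to out_width x out_height.

    Each output pixel copies the source pixel whose position corresponds
    to it under integer nearest-neighbor mapping. When the output is
    larger than the input, this enlarges the image (zoom in).
    """
    if not image or not image[0]:
        raise ValueError("image must contain at least one pixel")
    if out_width <= 0 or out_height <= 0:
        raise ValueError("output size must be positive")
    src_height = len(image)
    src_width = len(image[0])
    return [
        [image[oy * src_height // out_height][ox * src_width // out_width]
         for ox in range(out_width)]
        for oy in range(out_height)
    ]
```

A production renderer would normally delegate this step to the graphics stack, which can apply smoother interpolation than the pixel duplication shown here.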

Optionally, in an embodiment of the present invention, as shown in FIG. 18, the playback module 540 includes:
a display unit 541 configured to display an independent playback window; and
a playback unit 542 configured to render the decoded data of the first video image into the independent playback window displayed by the display unit for playback.

It should be understood that the video playback terminal 500 according to the embodiments of the present invention may correspond to the video playback device in the foregoing embodiments of the present invention, and that the foregoing and other operations and/or functions of the modules in the terminal 500 are used to implement the corresponding processes of methods 100 to 400 in FIG. 1 to FIG. 12B, respectively. For brevity, they are not repeated here.

Therefore, the video playback terminal according to the embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image of the region of interest in which a trigger event occurs. In one aspect, the user can observe clearer image details within the region of interest; in another aspect, the user can track image details in multiple regions of interest at the same time, which greatly improves the user experience.

FIG. 19 is a schematic block diagram of a system 600 according to an embodiment of the present invention. As shown in FIG. 19, the system 600 includes:
a terminal 610 according to an embodiment of the present invention;
a video capture system 620 configured to capture a video image and generate a media stream by encoding the video image;
a server 630 configured to obtain the media stream generated by the video capture system and provide the media stream to the terminal 610; and
a storage device 640 configured to store the media stream obtained by the server 630.

It should be understood that the terminal 610 included in the video playback system 600 according to the embodiments of the present invention may correspond to the video playback terminal 500 in the foregoing embodiments of the present invention, and that the foregoing and other operations and/or functions of the modules in the terminal 610 are used to implement the corresponding processes of methods 100 to 400 in FIG. 1 to FIG. 12B, respectively. For brevity, they are not repeated here.

Therefore, the video playback system according to the embodiments of the present invention divides the original playback image into multiple regions of interest and separately displays the image of the region of interest in which a trigger event occurs. In one aspect, the user can observe clearer image details within the region of interest; in another aspect, the user can track image details in multiple regions of interest at the same time, which greatly improves the user experience.

An embodiment of the present invention further provides a video playback terminal. As shown in FIG. 20, the terminal 700 includes a processor 710, a memory 720, and a bus system 730, where the processor 710 and the memory 720 are connected to each other through the bus system 730. The memory 720 is configured to store instructions, and the processor 710 is configured to execute the instructions stored in the memory 720. The processor 710 is configured to: divide the original playback image into at least two regions of interest; determine, among the at least two regions of interest, a first region of interest in which a trigger event occurs; acquire decoded data of a first video image displayed in the first region of interest; and render the decoded data of the first video image into a designated playback window for playback.

It should be understood that, in the embodiments of the present invention, the processor 710 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.

The memory 720 may include a read-only memory and a random access memory, and provides instructions and data to the processor 710. A part of the memory 720 may further include a non-volatile random access memory. For example, the memory 720 may further store device type information.

In addition to a data bus, the bus system 730 may further include a power bus, a control bus, a status signal bus, and the like. For clarity of description, however, all buses are shown as the bus system 730 in the figure.

During implementation, each step of the foregoing methods may be carried out by an integrated logic circuit of hardware in the processor 710 or by instructions in the form of software. The steps of the methods disclosed in the embodiments of the present invention may be directly performed by a hardware processor, or by a combination of hardware and software modules in the processor. The software module may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory 720. The processor 710 reads information from the memory 720 and performs the steps of the methods in combination with its hardware. To avoid repetition, details are not described here again.

Optionally, in an embodiment, the processor 710 is further configured to determine a correspondence between each region of interest among the at least two regions of interest and the designated playback window. The rendering, by the processor 710, of the decoded data of the first video image into the designated playback window for playback includes: rendering, according to the correspondence, the decoded data of the first video image into the designated playback window corresponding to the first region of interest for playback.
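The correspondence between regions of interest and designated playback windows can be pictured as a lookup table consulted at render time. The sketch below is illustrative only: `WindowRouter`, its methods, and the string window identifiers are hypothetical, and storing the data in a dictionary stands in for actually drawing to a window.

```python
from __future__ import annotations


class WindowRouter:
    """Hypothetical table mapping region indices to playback windows."""

    def __init__(self) -> None:
        self._window_for_region: dict[int, str] = {}
        # Stand-in for real drawing: remembers what each window last showed.
        self.rendered: dict[str, list] = {}

    def bind(self, region_index: int, window_id: str) -> None:
        """Record that a region is displayed in the given window."""
        self._window_for_region[region_index] = window_id

    def render(self, region_index: int, decoded_data: list) -> str:
        """Send decoded data to the window bound to the region.

        Raises KeyError if no window has been bound to the region.
        """
        window_id = self._window_for_region[region_index]
        self.rendered[window_id] = decoded_data
        return window_id
```

Several regions may be bound to the same window identifier in this sketch, which would model a single shared independent window, while distinct identifiers would model one window per region.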

Optionally, in an embodiment, the dividing, by the processor 710, of the original playback image into at least two regions of interest includes: dividing the original playback image into at least two regions of interest using an equal-division method or a free-division method.

Optionally, in an embodiment, the determining, by the processor 710, of the first region of interest in which the trigger event occurs among the at least two regions of interest includes: determining a trigger action performed by a user on a region of interest in the original playback image, where the trigger action includes a click action, a double-click action, or an action of selecting a region of interest; and determining the region of interest on which the trigger action is performed as the first region of interest.

Optionally, in an embodiment, the determining, by the processor 710, of the first region of interest in which the trigger event occurs among the at least two regions of interest includes: acquiring coordinate metadata of a trigger event occurrence point in the original playback image; and determining, according to the coordinate metadata, the region of interest to which the trigger event occurrence point belongs as the first region of interest.

Optionally, in an embodiment, the acquiring, by the processor 710, of the decoded data of the first video image displayed in the first region of interest includes: acquiring decoded data of the original playback image; and determining the decoded data of the first video image according to the decoded data of the original playback image.

Optionally, in an embodiment, the terminal 700 is further configured to render the decoded data of the first video image into the designated playback window for playback in a zoom-in manner, where the designated playback window is larger than the first region of interest.

Optionally, in an embodiment, the terminal 700 further includes a display 740. The rendering, by the processor 710, of the decoded data of the first video image into the designated playback window for playback includes displaying an independent playback window, and the display 740 is configured to render the decoded data of the first video image into the independent playback window for playback.

It should be understood that the video playback terminal 700 according to the embodiments of the present invention may correspond to the video playback terminal 500 or the terminal 610 in the foregoing embodiments of the present invention, and that the foregoing and other operations and/or functions of the modules in the terminal 700 are used to implement the corresponding processes of methods 100 to 400 in FIG. 1 to FIG. 12B, respectively. For brevity, they are not repeated here.

A person skilled in the art may be aware that the units and algorithm steps of the examples described in the embodiments disclosed in this specification can be implemented by electronic hardware, computer software, or a combination thereof. To clearly describe the interchangeability between hardware and software, the foregoing has generally described the composition and steps of each example according to function. Whether a function is performed by hardware or by software depends on the particular application and the design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each particular application, but such implementation should not be considered to go beyond the scope of the present invention.

A person skilled in the art can clearly understand that, for convenience and brevity of description, for the detailed working processes of the foregoing systems, apparatuses, and units, reference may be made to the corresponding processes in the method embodiments, and details are not described here again.

It should be understood that, in the several embodiments provided in this application, the disclosed systems, apparatuses, and methods may be implemented in other manners. For example, the described apparatus embodiments are merely exemplary. The division into units is merely a division by logical function, and other divisions may be used in actual implementation. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces. Indirect couplings or communication connections between apparatuses or units may be implemented in electronic, mechanical, or other forms.

Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one position or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments of the present invention.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each of the units may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

When the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, the integrated unit may be stored in a computer-readable storage medium. Based on such an understanding, the technical solutions of the present invention essentially, or the part relating to the prior art, or all or a part of the technical solutions, may be implemented in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for instructing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or a part of the steps of the methods described in the embodiments of the present invention. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

The foregoing descriptions are merely specific embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any equivalent modification or replacement readily figured out by a person skilled in the art within the technical scope of the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

The present invention can be used for video surveillance.

500 Terminal
510 Segmentation module
520 First determining module
530 Acquiring module
540 Playback module
600 System
610 Terminal
620 Video capture system
630 Server
640 Storage device

Claims

1. A method, comprising:
receiving a live video stream;
dividing the live video stream into at least two regions of interest;
determining, among the at least two regions of interest, a first region of interest that contains a trigger event by applying coordinate metadata associated with the trigger event, the coordinate metadata identifying the first region of interest;
determining decoded data associated with the first region of interest from the live video stream;
rendering the decoded data in an independent playback window to allow viewing of a zoomed version of the first region of interest of the live video stream; and
rendering the live video stream in a second playback window to allow viewing of the live video stream concurrently with the zoomed version of the first region of interest of the live video stream,
wherein dividing the live video stream occurs prior to determining the first region of interest that contains the trigger event, and
wherein rendering the decoded data in the independent playback window to allow viewing of a zoomed version of the first region of interest of the live video stream comprises rendering the decoded data into the independent playback window in a zoom-in manner.

2. The method of claim 1, further comprising determining a correspondence between each region of interest among the at least two regions of interest and the independent playback window.

3. The method of claim 1, further comprising receiving, from a server, the trigger event and the coordinate metadata associated with the trigger event,
wherein the step of receiving the trigger event and the coordinate metadata associated with the trigger event comprises determining whether the trigger event has occurred with a motion detector.

4. The method of claim 1, further comprising obtaining, from a server, the trigger event and the coordinate metadata associated with the trigger event,
wherein the step of obtaining the trigger event and the coordinate metadata associated with the trigger event comprises determining whether the trigger event was caused by an automatically generated trigger event.

5. The method of claim 1, wherein the independent playback window is larger than the first region of interest.

6. The method of claim 1, wherein the independent playback window is smaller than the first region of interest.

7. The method of claim 1, wherein all of the at least two regions of interest together comprise less than the entire live video stream.

8. The method of claim 1, wherein at least two of the at least two regions of interest have unequal sizes.

9. The method of claim 1, wherein the step of dividing the live video stream into at least two regions of interest occurs in response to manual user interaction.

10. The method of claim 1, further comprising:
determining, among the at least two regions of interest, a second region of interest that contains a second trigger event by applying coordinate metadata associated with the second trigger event, the coordinate metadata identifying the second region of interest;
determining second decoded data associated with the second region of interest from the live video stream; and
rendering the second decoded data in the independent playback window to allow viewing of a zoomed version of the second region of interest of the live video stream,
wherein rendering the second decoded data in the independent playback window to allow viewing of a zoomed version of the second region of interest of the live video stream comprises rendering the second decoded data into the independent playback window in a zoom-in manner.

11. The method of claim 1, wherein the trigger event satisfies an automatic event detection rule.

12. a processor; and
a non-persistent computer-readable medium storing instructions that, when executed by the processor, implement the method of any of claims 1 to 11.

13. A video playback terminal, comprising:
a segmentation module configured to divide a video stream into at least two regions of interest;
a first determining module configured to determine, among the at least two regions of interest, a first region of interest in which a trigger event occurs, the first determining module comprising a first determining unit configured to determine a trigger action performed by a user on a region of interest;
a first acquiring module configured to acquire, from a central server, coordinate metadata of an occurrence point associated with the trigger event;
a second acquiring module configured to acquire decoded data of video in the first region of interest determined by the first determining module;
a playback module configured to render the decoded data acquired by the second acquiring module into an independent playback window; and
a second playback module configured to render the video stream into a second playback window at the same time that the decoded data is rendered into the independent playback window,
wherein the segmentation module is further configured to divide the video stream before the first determining module determines the first region of interest in which the trigger event occurs, and
wherein the second playback module is further configured to render the decoded data into the independent playback window in a zoom-in manner.

14. The terminal of claim 13, further comprising a second determining module configured to determine a correspondence between each region of interest among the at least two regions of interest and the independent playback window.

15. The terminal of claim 13, wherein the playback module comprises:
a display unit configured to display the independent playback window; and
a playback unit configured to render the decoded data into the independent playback window displayed by the display unit.

16. A video playback system, comprising:
a video playback terminal;
a video capture system configured to capture a video stream and generate a media stream by encoding the video stream, the video capture system being responsive to a motion detector;
a server configured to obtain the media stream generated by the video capture system and to provide the media stream to the video playback terminal; and
a storage device configured to store the media stream obtained by the server,
wherein the video playback terminal comprises:
a segmentation module configured to divide the video stream into at least two regions of interest;
a first determining module configured to determine a first region of interest among the at least two regions of interest divided by the segmentation module, the first determining module being further configured to obtain, from the server, coordinate metadata of a trigger event occurrence point in the media stream based on a trigger event caused by the motion detector, and to determine, according to the coordinate metadata, the region of interest to which the trigger event occurrence point belongs as the first region of interest;
an acquiring module configured to acquire decoded data of the first region of interest determined by the first determining module;
a playback module configured to render the decoded data acquired by the acquiring module into an independent playback window; and
a second playback module configured to render the video stream into a second playback window at the same time that the decoded data is rendered into the independent playback window,
wherein the segmentation module is further configured to divide the video stream before the first determining module determines the first region of interest that contains the trigger event, and
wherein the second playback module is further configured to render the decoded data into the independent playback window in a zoom-in manner.