JPWO2020095728A1

JPWO2020095728A1 - Information processing device and information processing method

Info

Publication number: JPWO2020095728A1
Application number: JP2020555963A
Authority: JP
Inventors: 哲博内田; 祐介阪井; 美和市川
Original assignee: Sony Corp; Sony Group Corp
Current assignee: Sony Corp; Sony Group Corp
Priority date: 2018-11-06
Filing date: 2019-10-25
Publication date: 2021-09-24
Anticipated expiration: 2039-10-25
Also published as: EP3879819A1; WO2020095728A1; JP7420078B2; US20210399913A1; CN113330735A; US11641448B2; EP3879819A4

Abstract

PROBLEM TO BE SOLVED: To provide an information processing device and an information processing method capable of reducing the sense of delay given to a user.
SOLUTION: An information processing device includes an acquisition unit, an encoding unit, a context recognition unit, a priority data extraction unit, and a communication unit. The acquisition unit acquires data related to a transmission point. The encoding unit encodes the data related to the transmission point. The context recognition unit sets, from the data related to the transmission point, data to be transmitted preferentially, based on the situation of the transmission point recognized using that data. The priority data extraction unit extracts the data to be transmitted preferentially as priority data, based on the setting made by the context recognition unit. The communication unit transmits the data encoded by the encoding unit and the unencoded priority data to an information processing device at a receiving point.
[Selected drawing] FIG. 1

Description

The present technology relates to an information processing device and an information processing method.

It has become possible to transmit video and audio data bidirectionally between different points over a communication network, as in a video conference, and to exchange information in real time (see, for example, Patent Document 1).

In Patent Document 1, the low-delay priority of video and audio data is determined based on the amount of conversation in a video conference and on how lively the conversation is.

Patent Document 1: Japanese Unexamined Patent Publication No. 2009-76952

With the technique described in Patent Document 1, it is difficult to provide appropriate video to the user because of the delay of the video and audio data, for example in a situation where there is no conversation between users and video synchronization is required between two parties at different points.

In view of the above circumstances, an object of the present technology is to provide an information processing device and an information processing method capable of reducing the sense of delay given to a user.

In order to achieve the above object, an information processing device according to one embodiment of the present technology includes an acquisition unit, an encoding unit, a context recognition unit, a priority data extraction unit, and a communication unit.
The acquisition unit acquires data related to a transmission point.
The encoding unit encodes the data related to the transmission point.
The context recognition unit sets, from the data related to the transmission point, data to be transmitted preferentially, based on the situation of the transmission point recognized using the data related to the transmission point.
The priority data extraction unit extracts the data to be transmitted preferentially as priority data, based on the setting made by the context recognition unit.
The communication unit transmits the data encoded by the encoding unit and the unencoded priority data to an information processing device at a receiving point.

With this configuration, data to be transmitted preferentially to the information processing device at the receiving point is extracted based on the situation at the transmission point, and that data is transmitted to the information processing device at the receiving point without being encoded. Because the preferentially transmitted data does not incur the time required for encoding, it can be transmitted to the information processing device at the receiving point earlier than the data that is encoded.
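The two transmission paths described above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: `Packet`, `encode`, and `build_packets` are hypothetical names, `zlib` merely stands in for the AV codec, and the priority data is assumed to be a contiguous byte range of the sensing data.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Packet:
    kind: str       # "priority" (unencoded) or "encoded"
    payload: bytes


def encode(sensing_data: bytes) -> bytes:
    """Stand-in for the AV codec (assumption: zlib is used only for illustration)."""
    return zlib.compress(sensing_data)


def build_packets(sensing_data: bytes, priority_range: slice) -> list[Packet]:
    """Build the unencoded priority packet first, then the encoded packet."""
    # Priority path: the extracted data skips the codec entirely.
    priority = Packet("priority", sensing_data[priority_range])
    # Normal path: all of the sensing data is encoded.
    encoded = Packet("encoded", encode(sensing_data))
    return [priority, encoded]


packets = build_packets(b"background" + b"HAND" + b"background", slice(10, 14))
```

In this sketch the priority packet is ready as soon as the range is sliced out, whereas the encoded packet is available only after `encode` returns, which mirrors the ordering the configuration relies on.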

The priority data extraction unit may extract, as the priority data, the data to be transmitted preferentially, the situation of the transmission point, and the reproduction time of the data to be transmitted preferentially.

The information processing device may further include a storage unit that stores the priority data, and a priority data prediction unit that predicts data to be transmitted preferentially based on the priority data stored in the storage unit.

The data related to the transmission point may include video data.
The data related to the transmission point may further include at least one of sound data and depth data.

In order to achieve the above object, an information processing device according to one embodiment of the present technology includes a communication unit, a decoding unit, a determination unit, a reproduction data generation unit, and an output unit.
The communication unit receives, from an information processing device at a transmission point, encoded data obtained by encoding data related to the transmission point, and unencoded priority data extracted from the data related to the transmission point.
The decoding unit decodes the encoded data.
The determination unit determines the reproduction time and reproduction method of the unencoded priority data.
The reproduction data generation unit generates reproduction data of the priority data based on the determination made by the determination unit.
The output unit outputs the data decoded by the decoding unit and the reproduction data of the priority data.

With this configuration, the unencoded priority data does not need to be decoded, so it can be reproduced earlier than the encoded data.

The information processing device may further include: a storage unit that stores the content of the determination made by the determination unit; a reproduction completion confirmation unit that refers to the determination content stored in the storage unit and confirms whether any of the decoded data has already been reproduced as reproduction data of the priority data; and an interpolation data generation unit that, when the reproduction completion confirmation unit confirms that the reproduction data of the priority data has already been reproduced, generates interpolation data for joining the reproduction data of the priority data and the decoded data.
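As a rough illustration of the reproduction completion check and the interpolation step, the following Python sketch skips decoded samples that were already reproduced from priority data and inserts intermediate values to join the two streams. It is only a sketch under stated assumptions: the data is reduced to scalar values keyed by reproduction time, linear interpolation is an arbitrary choice, and `interpolate`, `merge_streams`, and `played_log` are hypothetical names not taken from the source.

```python
def interpolate(start: float, end: float, steps: int) -> list[float]:
    """Return `steps` evenly spaced values strictly between start and end."""
    return [start + (end - start) * i / (steps + 1) for i in range(1, steps + 1)]


def merge_streams(decoded, played_log, steps=1):
    """Join decoded data with priority data that has already been reproduced.

    decoded:    list of (reproduction_time, value) from the decoder
    played_log: {reproduction_time: value} already reproduced from priority data
    """
    output = []
    pending = None  # value last reproduced from priority data, if any
    for time, value in decoded:
        if time in played_log:
            # Already reproduced via priority data: do not output it again.
            pending = played_log[time]
            continue
        if pending is not None:
            # Bridge from the priority reproduction to the decoded data.
            output.extend(("interp", v) for v in interpolate(pending, value, steps))
            pending = None
        output.append((time, value))
    return output


result = merge_streams([(0, 0.0), (1, 10.0), (2, 20.0)], {1: 12.0})
```

Here the sample at time 1 is dropped because it was already reproduced from priority data, and one interpolated value is emitted before the decoded sample at time 2.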

In order to achieve the above object, in an information processing method according to one embodiment of the present technology,
an information processing device at a transmission point
acquires data related to the transmission point,
encodes the data related to the transmission point,
extracts, from the data related to the transmission point, data to be transmitted preferentially as priority data, based on the situation of the transmission point recognized using the data related to the transmission point, and
transmits the encoded data and the unencoded priority data to an information processing device at a receiving point, and
the information processing device at the receiving point
receives the encoded data and the unencoded priority data,
decodes the encoded data,
determines the reproduction time and reproduction method of the unencoded priority data,
generates reproduction data of the priority data based on the determination, and
outputs the decoded data and the reproduction data of the priority data.

A diagram showing the configuration of an information processing system that uses the information processing device according to an embodiment of the present technology, and the configuration of the information processing device.
A flow chart of an information processing method for delay control in the information processing device at the transmission point.
A flow chart of an information processing method for delay control in the information processing device at the receiving point.
A diagram illustrating a specific example of delay control in the information processing method according to an embodiment of the present technology.
A diagram illustrating a specific example of video delay in an information processing method according to a comparative example.

An information processing device according to an embodiment of the present disclosure and an information processing system using the device will be described. The information processing system relates to a communication system that transmits video data and audio data bidirectionally between two information processing devices installed at different points.

In the information processing system of the present embodiment, the situation of the point where an information processing device is installed (hereinafter sometimes referred to as a scene) is recognized using sensing data, such as video data and sound data, that relates to the point and is acquired by the information processing device at that point.

The sensing data acquired by the information processing device at the transmission point is encoded through an AV codec and transmitted to the information processing device at the receiving point.
In addition, the information processing device at the transmission point extracts from the sensing data, according to the scene, data to be sent preferentially to the information processing device at the receiving point (hereinafter sometimes referred to as priority data). The extracted priority data is transmitted to the information processing device at the receiving point separately from the encoded data, without passing through the AV codec.

The information processing device at the receiving point reproduces the encoded data and the unencoded priority data.

As described above, in the information processing system of the present embodiment, the priority data is sent separately and reproduced without passing through the AV codec, so the delay attributable to the AV codec is eliminated and the priority data is reproduced promptly. As a result, the information processing device at the receiving point receives video data and audio data whose delay is reduced according to the situation, which reduces the sense of delay given to the user.
A detailed description follows.

(Configuration of information processing system)
FIG. 1 shows the configuration of an information processing system 50 according to the present embodiment.
As shown in FIG. 1, the information processing system 50 includes a first information processing system 20A and a second information processing system 20B. These two information processing systems 20A and 20B can communicate bidirectionally via a network 30.

In the present embodiment, the user of the first information processing system 20A is referred to as Mr. A, and the user of the second information processing system 20B is referred to as Mr. B. The first information processing system 20A is installed at point A, where Mr. A is located. The second information processing system 20B is installed at point B, where Mr. B is located. Point A and point B are in different places. When point A is the transmission point, point B is the receiving point, and when point B is the transmission point, point A is the receiving point.

The network 30 may include public networks such as the Internet, a telephone network, and a satellite communication network, as well as various LANs (Local Area Networks) including Ethernet (registered trademark) and WANs (Wide Area Networks). The network 30 may also include a dedicated network such as an IP-VPN (Internet Protocol-Virtual Private Network). The network 30 may further include a wireless communication network such as Wi-Fi (registered trademark) or Bluetooth (registered trademark).

The first information processing system 20A and the second information processing system 20B have the same configuration. Here, the two are distinguished by appending A to the reference signs given to the components of the first information processing system 20A and appending B to the reference signs given to the components of the second information processing system 20B.
In the following description, A and B are omitted when there is no particular need to distinguish the first information processing system 20A from the second information processing system 20B.

The information processing system 20 includes an information processing device 21, a sensor group 22, and a reproduction unit 23. The configuration of the information processing device 21 will be described later.

The sensor group 22 includes various sensors such as a video camera 221, a microphone 222 serving as a sound collecting unit, a depth sensor 223, and an infrared camera 224. The sensing data acquired by these sensors contains information (data) about the point. Specifically, the data about the point is video data, sound data, depth data, and the like of a person at the point and that person's surroundings.

In the present embodiment, the situation (scene) of a point is recognized using the sensing data. Scene recognition is performed using sensing data related to at least one of the plurality of points communicating with each other.

Scene recognition may be performed using sensing data obtained at each of the plurality of points. The scene taking place between the two parties may be recognized based on the sensing data obtained by the communication partner's information processing device and the sensing data acquired by the information processing device itself.

The video camera 221 acquires video data of the point.
The microphone 222 collects sound at the point, for example human voices and environmental sounds, and acquires sound data.
The depth sensor 223 uses, for example, infrared light to acquire depth data indicating the distance from the depth sensor to a person or object at the point. Any method, such as a TOF (Time of Flight) method, a pattern irradiation method, or a stereo camera method, can be adopted for the depth sensor.
The infrared camera 224 acquires infrared image data of a person, an object, or the like. The infrared image data can be used, for example, to estimate a person's skeleton.

The reproduction unit 23 includes a video player 231, an audio player 232, a display unit 233, and a speaker 234 serving as an audio output unit.

The video player 231 receives reproduction data that is output from the reproduction data output unit 15 of the information processing device 21, described later, and that is based on the priority data, predicted priority data, and non-priority data. It performs reproduction processing such as D/A conversion and amplification on that reproduction data and causes the display unit 233 to display the video.

The audio player 232 receives reproduction data that is output from the reproduction data output unit 15 of the information processing device 21, described later, and that is based on the priority data, predicted priority data, and non-priority data. It performs reproduction processing such as D/A conversion and amplification on that reproduction data and causes the speaker 234 to output the audio.
The priority data, predicted priority data, and non-priority data will be described later.

Whether the reproduction processing of data is performed by the video player 231 or by the audio player 232 is determined by the reproduction time/reproduction method determination unit 9 of the information processing device 21, described later.

The display unit 233 displays the video reproduced by the video player 231.
The display unit 233 is a display device such as a liquid crystal display, a plasma display, or an OELD (Organic Electro Luminescence Display). The display unit 233 is configured to be able to display the video of the communication partner's point, the video of its own point, or both.

For example, the display unit 233A of the first information processing system 20A displays the video acquired by the second information processing system 20B, and the display unit 233B of the second information processing system 20B displays the video acquired by the first information processing system 20A.

The speaker 234 outputs the audio reproduced by the audio player 232.

The speaker 234A of the first information processing system 20A outputs the audio acquired by the second information processing system 20B, and the speaker 234B of the second information processing system 20B outputs the audio acquired by the first information processing system 20A.

(Configuration of information processing device)
The information processing device 21 includes a sensing data acquisition unit 1, a data encoding unit 2, a context recognition unit 3, a priority data extraction unit 4, a short-term priority data storage unit 5, a priority data prediction unit 6, a communication unit 7, a priority data sorting unit 8, a reproduction time/reproduction method determination unit 9, a data decoding unit 10, a priority data reproduction storage unit 11, a reproduction data generation unit 12, a priority data reproduction completion confirmation unit 13, an interpolation data generation unit 14, and a reproduction data output unit 15.

The sensing data acquisition unit 1, serving as the acquisition unit, acquires the sensing data obtained by the various sensors constituting the sensor group 22. The sensing data includes video data, sound data, depth data, and infrared image data. The sensing data is data related to the point where it was acquired. Here, the case where video data, sound data, and depth data are used as the sensing data will be described.

The sensing data acquired by the sensing data acquisition unit 1 is output to the data encoding unit 2 and the context recognition unit 3.
The acquired sensing data is also stored in time series in a sensing data storage unit (not shown).

The data encoding unit 2, serving as the encoding unit, encodes the sensing data acquired by the sensing data acquisition unit 1. It encodes all of the data acquired by the sensing data acquisition unit 1 (here, the video data, sound data, and depth data). The encoded data is output to the communication unit 7. The encoded data is non-priority data.

The data encoding unit 2 and the data decoding unit 10, described later, are provided in an AV codec (not shown).

The context recognition unit 3 recognizes the situation (scene) of the point based on the sensing data acquired by the sensing data acquisition unit 1.

For example, the context recognition unit 3 recognizes whether there are multiple people at the point, whether people are having a conversation, what the people are doing, what objects are at the point, whether an object is moving or stationary, and, if an object is moving, what it is doing.

This recognition is performed by inputting each kind of sensing data into a corresponding analyzer and analyzing it.

Among the various kinds of sensing data, the sound data is separated by an analyzer into human voice and environmental sound using voice recognition. This makes it possible to extract the sound data of the human voice.

In voice recognition, language recognition is also performed on the sound data, the words contained in the sound data are recognized word by word, and keywords are extracted. The extracted keywords include nouns representing the names of objects and the like, instruction words such as "stop" and "go", and words expressing emotions such as "happy" and "fun". The situation (scene) of the point can be recognized using the keywords extracted in this way.

Known voice recognition methods can be used, for example a method in which voice features are accumulated from training data and keywords are extracted by comparing those features with the supplied voice.

Among the various kinds of sensing data, the video data is processed by an analyzer that takes, for each pixel, the difference in pixel value from the previously acquired frame image.
Among the various kinds of sensing data, the depth data is processed by an analyzer that takes the difference from the previously acquired depth data and, based on this difference, obtains the amount of movement of an object in the depth direction and the vertical direction.
The previously acquired frame image and depth data are taken from the data stored in the sensing data storage unit.
Movement information of a person or an object can be obtained from these pixel value differences and the amount of movement of the object.
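The difference-based analysis can be sketched in a few lines of Python. This is an illustrative sketch only: frames and depth maps are assumed to be small 2D lists of numbers, the motion threshold is a hypothetical parameter not given in the source, and deriving vertical movement from the depth difference is omitted.

```python
def pixel_differences(prev_frame, curr_frame):
    """Per-pixel absolute difference between two frames of equal size."""
    return [
        [abs(curr - prev) for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def depth_differences(prev_depth, curr_depth):
    """Per-pixel change in distance from the depth sensor (positive: moved away)."""
    return [
        [curr - prev for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_depth, curr_depth)
    ]


def has_motion(diff, threshold):
    """True if any pixel changed by more than the (assumed) threshold."""
    return any(value > threshold for row in diff for value in row)


prev = [[10, 10], [10, 10]]
curr = [[10, 50], [10, 10]]
diff = pixel_differences(prev, curr)
moved = has_motion(diff, threshold=20)
```

In practice the previous frame and depth map would come from the sensing data storage unit, as described above.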

The scene is estimated and recognized based on the analysis results of each kind of sensing data.

Examples of scenes are those that take place between two parties at two different places (Mr. A and Mr. B in the present embodiment) using communication between information processing systems installed at those places (the first information processing system 20A and the second information processing system 20B in the present embodiment): a scene of playing rock-paper-scissors, a scene of holding a video conference, a scene of performing remote operation, a scene of pointing remotely, and so on. The scenes are not limited to these.

Furthermore, the context recognition unit 3 sets an identification flag for the data to be synchronized between the two points and between the two parties, based on the recognized scene. The data to be synchronized between the two points and between the two parties is the data that should be sent preferentially to the partner's information processing device (the information processing device at the transmission point), and is the information that matters for smooth communication between the two parties in the scene. The identification flag is set so that it is possible to identify whether data should be sent preferentially.
Which data should be sent preferentially to the communication partner's information processing device is set according to the scene. Specific examples will be described later.

The data to be synchronized (the data to be sent preferentially) includes sound data such as the human voice identified by analyzing the sound data, pixel values obtained by analyzing the video data, and the amount of movement of an object obtained by analyzing the depth data.
Specific examples will be described later.

The scene name recognized by the context recognition unit 3, the information on the identification flag set for that scene name, and the reproduction time of the data to be sent preferentially are output to the priority data extraction unit 4.

The priority data extraction unit 4 extracts the information to be synchronized between the two points and between the two parties, that is, the data to be transmitted preferentially, based on the identification flag that has been set. The priority data extraction unit 4 combines the data to be transmitted preferentially, the scene name, the identification flag information, and the reproduction time, associates them with one another, and outputs them as priority data to the short-term priority data storage unit 5 and the communication unit 7.
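One way to picture the priority data record is as a small structure that ties the extracted data to its scene name, identification flag, and reproduction time. The Python sketch below is hypothetical: `SCENE_FLAGS`, `PriorityData`, and `extract_priority` are illustrative names, and apart from the rock-paper-scissors entry (where the source names the pixel values of the hand as priority data) the scene-to-flag mapping is an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed mapping from recognized scene name to the kind of data flagged
# for priority transmission. Only the first entry is suggested by the text.
SCENE_FLAGS = {
    "rock-paper-scissors": "hand_pixels",
    "video conference": "voice",  # hypothetical example
}


@dataclass(frozen=True)
class PriorityData:
    scene_name: str
    flag: str
    reproduction_time: float
    data: bytes


def extract_priority(scene_name: str, sensing: dict, reproduction_time: float) -> Optional[PriorityData]:
    """Pick out the flagged data and link it with scene, flag, and time."""
    flag = SCENE_FLAGS.get(scene_name)
    if flag is None or flag not in sensing:
        return None  # nothing is flagged for this scene
    return PriorityData(scene_name, flag, reproduction_time, sensing[flag])


sensing = {"hand_pixels": b"\x01\x02", "voice": b"hi"}
record = extract_priority("rock-paper-scissors", sensing, 12.5)
```

The same record would then be handed both to short-term storage and to the communication unit, so that the receiver gets the scene and reproduction time along with the raw data.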

The short-term priority data storage unit 5 stores the priority data extracted by the priority data extraction unit 4 for a short period, in chronological order.

The priority data prediction unit 6 predicts future priority data based on the priority data stored in the short-term priority data storage unit 5 and generates predicted priority data. The predicted priority data is output to the communication unit 7.

More specifically, based on the time-series priority data stored in the short-term priority data storage unit 5, the priority data prediction unit 6 determines whether a person or object in the video is performing a continuous motion in the recognized scene.
When the priority data prediction unit 6 determines that the person or object is performing a continuous motion, it further determines, from the time-series priority data stored in the short-term priority data storage unit 5, whether the motion is in a phase in which the motion of the person or object can be predicted.
When it determines that the motion is in a predictable phase, it predicts and generates predicted priority data, which is future priority data, based on the information stored in the short-term priority data storage unit 5.

Providing the priority data prediction unit in this way makes it possible to present important data suited to the scene to the communication partner preferentially, which makes communication between the two parties smoother.

As an example, in a rock-paper-scissors scene, the information on the played hand is important, and the pixel values of the video data for the hand region are the priority data.
In rock-paper-scissors, the shape the hand will take can be predicted from how the hand changes shape before it fully forms "goo" (rock), "choki" (scissors), or "par" (paper). It is therefore possible to predict and generate the predicted priority data, that is, future priority data, from the time-series priority data stored in the short-term priority data storage unit 5 before the hand is fully played.
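The two determinations made by the priority data prediction unit (continuous motion, then predictable phase) can be sketched with a toy feature. This specification does not state how the hand shape is predicted; the sketch below assumes, purely for illustration, that each stored sample has been reduced to an estimated count of extended fingers, and it uses simple linear extrapolation. The function name, thresholds, and feature are hypothetical.

```python
from typing import Optional, Sequence


def predict_hand(finger_counts: Sequence[float],
                 min_samples: int = 3) -> Optional[str]:
    """Return a predicted shape, or None when no prediction should be made."""
    if len(finger_counts) < min_samples:
        return None  # too little history: not yet in a predictable phase
    recent = finger_counts[-min_samples:]
    steps = [b - a for a, b in zip(recent, recent[1:])]
    opening = all(s > 0 for s in steps)
    closing = all(s < 0 for s in steps)
    if not (opening or closing):
        return None  # the hand is not moving continuously in one direction
    # Extrapolate one step ahead and clamp to the range 0..5 fingers.
    projected = min(5.0, max(0.0, recent[-1] + steps[-1]))
    if projected < 1.0:
        return "goo"
    if projected < 3.5:
        return "choki"
    return "par"
```

For example, a history of `[1, 3, 5]` extrapolates to an open hand, while a flat history such as `[2, 2, 2]` yields no prediction. A real system would likely use a learned model over the pixel values rather than a single scalar feature.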

The predicted priority data includes the data predicted to be transmitted preferentially, such as sound data, pixel values, and object movement amounts, together with the scene name, the identification flag, and the reproduction time. The predicted priority data is transmitted to the information processing device of the communication partner without being encoded.

The communication unit 7 transmits data to and receives data from the information processing device of the communication partner. In the present embodiment, the communication partner of the first information processing device 20A is the second information processing device 20B, and the communication partner of the second information processing device 20B is the first information processing device 20A.

The communication unit 7 transmits, to the information processing device of the communication partner, the priority data and non-priority data obtained from the sensing data acquired by its own information processing device. The communication unit 7 also receives the priority data and non-priority data obtained from the sensing data acquired by the information processing device of the communication partner.

In this way, the priority data is transmitted to the information processing device of the communication partner without passing through the AV codec, that is, without being encoded. Unlike the non-priority data, which is encoded by the AV codec before transmission, the priority data requires no encoding time, so it incurs no encoding delay from the AV codec and can be transmitted to the communication partner sooner.
As a result, video data and audio data can be delivered to the information processing device at the receiving point with reduced delay, which enables smooth communication between the two parties.

The priority data sorting unit 8 sorts the data that the communication unit 7 receives from the information processing device of the communication partner into priority data or predicted priority data on the one hand and non-priority data on the other. The priority data and the predicted priority data are unencoded. The non-priority data is encoded.

The priority data sorting unit 8 outputs the non-priority data to the data decoding unit 10.
The priority data sorting unit 8 outputs the priority data and the predicted priority data to the reproduction time / reproduction method determination unit 9.
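The sorting described above amounts to routing each received item by its kind. The following sketch assumes that every received packet carries a `kind` label; that label, the dictionary layout, and the function name are illustrative assumptions, since this specification does not describe how the two classes are distinguished on the wire.

```python
PRIORITY_KINDS = {"priority", "predicted_priority"}


def sort_received(packets):
    """Split packets into those for the determination unit and those for the decoder."""
    to_determination, to_decoder = [], []
    for packet in packets:
        if packet["kind"] in PRIORITY_KINDS:
            to_determination.append(packet)  # unencoded: skip the decoder
        else:
            to_decoder.append(packet)        # encoded non-priority data
    return to_determination, to_decoder


# Assumed sample packets, for illustration only.
received = [
    {"kind": "non_priority", "body": b"encoded"},
    {"kind": "priority", "body": [1, 2]},
    {"kind": "predicted_priority", "body": [3, 4]},
]
to_determination, to_decoder = sort_received(received)
```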

The reproduction time / reproduction method determination unit 9, which serves as a determination unit, determines the reproduction time of the priority data (predicted priority data) sorted by the priority data sorting unit 8, and determines which player reproduces the video data (predicted video data) or sound data (predicted sound data) that constitutes the priority data (predicted priority data). The determined content is stored in the priority data reproduction storage unit 11. The determined content is also output to the reproduction data generation unit 12.

The data decoding unit 10, which serves as a decoding unit, decodes the non-priority data sorted by the priority data sorting unit 8. The decoded non-priority data is output to the priority data reproduction completed confirmation unit 13.

The priority data reproduction storage unit 11, which serves as a storage unit, stores, as the content determined by the reproduction time / reproduction method determination unit 9, the content of the priority data, the reproduction time at which reproduction using the priority data is performed, and information on the player used for the reproduction.

The reproduction data generation unit 12 generates reproduction data for the priority data and the predicted priority data based on the content determined by the reproduction time / reproduction method determination unit 9. The generated reproduction data is output to the reproduction data output unit 15.

Before the data decoded by the data decoding unit 10 is reproduced, the priority data reproduction completed confirmation unit 13, which serves as a reproduction completed confirmation unit, refers to the determined content stored in the priority data reproduction storage unit 11 and checks whether any of the decoded non-priority data received from the information processing device of the communication partner has already been reproduced using the priority data.

The priority data reproduction completed confirmation unit 13 outputs the decoded non-priority data that has not yet been reproduced to the reproduction data output unit 15. It also outputs the confirmation result to the interpolation data generation unit 14.

The interpolation data generation unit 14 generates interpolation data that bridges the priority data confirmed to have already been reproduced and the decoded non-priority data, so that the two join smoothly. The generated interpolation data is output to the reproduction data output unit 15.

Providing the interpolation data generation unit 14 in this way makes it possible to display video in which the movement of a person or the like flows with little unnaturalness, and to output audio in which the person's voice flows with little unnaturalness.
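This specification does not state which interpolation method the interpolation data generation unit uses. As one possible illustration only, the sketch below substitutes a plain linear blend between the last frame reproduced from priority data and the first decoded frame, with each frame flattened to a list of numeric samples. The function name and the frame representation are assumptions.

```python
def interpolate(priority_frame, decoded_frame, steps):
    """Return `steps` intermediate frames blending linearly between two frames."""
    frames = []
    for i in range(1, steps + 1):
        t = i / (steps + 1)  # blend weight, strictly between 0 and 1
        frames.append([(1 - t) * p + t * d
                       for p, d in zip(priority_frame, decoded_frame)])
    return frames
```

With `steps=1`, the single intermediate frame is the midpoint of the two inputs. The same blend could be applied to audio samples to avoid an audible discontinuity.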

The reproduction data output unit 15, which serves as an output unit, receives the reproduction data from the reproduction data generation unit 12, the decoded data from the priority data reproduction completed confirmation unit 13, and the interpolation data from the interpolation data generation unit 14, and outputs the reproduction data to the reproduction unit 23.

In the information processing device 21 at the receiving point, the priority data is not encoded and therefore does not need to be decoded through the AV codec. It incurs no decoding delay from the AV codec and can be reproduced sooner than the non-priority data.
As a result, the data transmitted from the information processing device 21 at the transmission point can be reproduced with reduced delay, which enables smoother communication between the two parties.

(Information processing method)
Next, the information processing method executed by the information processing system 50 described above is explained separately for the transmitting side and the receiving side, with reference to FIGS. 2 and 3.
For convenience, point A is treated here as the transmission point and point B as the reception point. Naturally, point A may instead be the reception point and point B the transmission point, and the same processing is performed in that case. The configuration shown in FIG. 1 is referred to below as needed.

[Operation flow in the information processing system at the transmission point]
FIG. 2 is a flow chart of the information processing method related to delay control in the information processing system on the transmitting side (here, the first information processing system 20A). The information processing method in the transmitting-side information processing system is described below with reference to FIG. 2.

As shown in FIG. 2, the sensing data acquisition unit 1A of the first information processing device 21A acquires, as data related to point A, the sensing data captured by the various sensors of the sensor group 22A (S1). In the present embodiment, the sensing data includes sound data, video data, and depth data.

The data encoding unit 2A synchronizes the acquisition times of the sensing data (sound data, video data, and depth data) acquired by the sensing data acquisition unit 1A, and then encodes the data with a general-purpose codec (S2).

In the codec processing, the sound data, video data, and depth data are subjected to codec processing with a short processing time. For example, the sound data and video data are encoded with VP9, which is suited to real-time communication and has a short processing time.

The encoded data, that is, the non-priority data, is transmitted to the second information processing device 21B via the communication unit 7A (S12).

The context recognition unit 3A performs voice recognition on the acquired sound data (S3). The voice recognition distinguishes human voices from environmental sounds.

The context recognition unit 3A obtains pixel values from the acquired video data and then calculates, for each pixel, the difference in pixel value from the previous frame (S4).

The context recognition unit 3A takes the difference between the acquired depth data and the depth information of the previous frame, and obtains the amount of movement of objects in the depth direction and the vertical direction (S5).
Motion information about a person or an object can be obtained from the pixel value differences and the object movement amounts.
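Steps S4 and S5 are both frame-to-frame differences. The sketch below shows one simple way to compute them on frames represented as nested lists. It is an illustration under assumptions: the function names are hypothetical, the movement amount is reduced to the mean change in depth, and the vertical-direction movement mentioned in S5 is omitted for brevity.

```python
def pixel_diff(prev_frame, curr_frame):
    """Per-pixel absolute difference between two equally sized frames (S4)."""
    return [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]


def depth_movement(prev_depth, curr_depth):
    """Mean change in depth between two frames (S5); positive means moving away."""
    deltas = [c - p
              for prev_row, curr_row in zip(prev_depth, curr_depth)
              for p, c in zip(prev_row, curr_row)]
    return sum(deltas) / len(deltas)
```

Large values in the difference map indicate where something moved, which is the kind of cue the scene recognition in S6 could draw on.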

The context recognition unit 3A recognizes the scene based on the voice recognition result, the pixel value difference result, and the object movement amount result (S6).
Next, based on the recognized scene, the context recognition unit 3A sets an identification flag on the data so that the data to be transmitted preferentially can be identified (S7).

Next, based on the set identification flag, the priority data extraction unit 4A extracts, from the sound data, pixel values, object movement amounts, and the like, the data to be transmitted preferentially to the second information processing device 21B. The extracted data is combined with the scene name, the identification flag information, and the reproduction time, and extracted as priority data (S8).

Next, the extracted priority data is written to and stored in the short-term priority data storage unit 5A (S9).

Next, based on the time-series priority data stored in the short-term priority data storage unit 5, the priority data prediction unit 6 determines whether a person or an object is performing a continuous motion in the recognized scene and whether the motion is in a phase in which it can be predicted from that time-series priority data (S10).

If the determination in S10 is No, the process proceeds to S12, and the priority data is transmitted to the second information processing device 21B via the communication unit 7A (S12).

If it is determined in S10 that the person or object is performing a continuous motion and that the motion is in a predictable phase (Yes), the process proceeds to S11.

In S11, predicted priority data is generated based on the information stored in the short-term priority data storage unit 5. The generated predicted priority data and the priority data are transmitted to the second information processing device 21B via the communication unit 7A (S12).
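The branch formed by steps S9 to S12 can be summarized in a few lines. This is a sketch under assumptions, not the flow of FIG. 2 itself: the predictability test and the predictor are passed in as hypothetical callables, the short-term storage is modeled as a bounded `deque`, and the samples are plain numbers.

```python
from collections import deque


def sender_step(priority, history, can_predict, predict):
    """Store one priority sample and return the items to transmit unencoded."""
    history.append(priority)                        # S9: short-term storage
    outgoing = [("priority", priority)]
    if can_predict(history):                        # S10: continuous and predictable?
        outgoing.append(("predicted_priority", predict(history)))  # S11
    return outgoing                                 # S12: handed to the communication unit


# Assumed toy predictor: extrapolate the last step once three samples exist.
history = deque(maxlen=8)
results = [
    sender_step(sample, history,
                can_predict=lambda h: len(h) >= 3,
                predict=lambda h: h[-1] + (h[-1] - h[-2]))
    for sample in (1, 2, 3)
]
```

The bounded `deque` drops the oldest samples automatically, which matches the short-term nature of the storage.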

[Operation flow in the information processing system at the receiving point]
FIG. 3 is a flow chart of the information processing method related to delay control in the information processing system on the receiving side (here, the second information processing system 20B). The information processing method in the receiving-side information processing system is described below with reference to FIG. 3.

As shown in FIG. 3, the communication unit 7B of the second information processing device 21B receives the priority data, non-priority data, and predicted priority data from the first information processing device 21A via the network 30 (S31).

Next, the priority data sorting unit 8B determines whether the received data is priority data or predicted priority data (S32).
If it is determined in S32 that the data is neither priority data nor predicted priority data, that is, that it is non-priority data (No), the process proceeds to S33.
If it is determined in S32 that the data is priority data or predicted priority data (Yes), the process proceeds to S38.

In S38, the reproduction time / reproduction method determination unit 9 determines the reproduction time and the reproduction method of the priority data or predicted priority data to be reproduced. The reproduction method indicates which player is used for the reproduction.

The reproduction time and the reproduction method are stored in the priority data reproduction storage unit 11B as reproduction information for the priority data or predicted priority data (S39).

Next, the reproduction data generation unit 12B generates reproduction data from the priority data or predicted priority data according to the determined reproduction method (S40). The generated reproduction data is output to the reproduction data output unit 15B, and the process proceeds to S36.

In S33, the data decoding unit 10B decodes the non-priority data, which is encoded data.

Next, the priority data reproduction completed confirmation unit 13B refers to the data stored in the priority data reproduction storage unit 11B and, using the reproduction time as a key, checks whether the content included in the decoded data has already been reproduced using the priority data (S34).

If it is confirmed in S34 that the content has not been reproduced (No), the decoded data is output to the reproduction data output unit 15B, and the process proceeds to S36.

If it is confirmed in S34 that the content has already been reproduced (Yes), the interpolation data generation unit 14B generates interpolation data so that the preceding reproduction based on the priority data connects smoothly to the reproduction based on the decoded data (S35). The generated interpolation data is output to the reproduction data output unit 15B, and the process proceeds to S36.
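The check in S34 is a lookup keyed on the reproduction time. The sketch below models the stored reproduction information as a set of integer millisecond timestamps, which avoids comparing floating-point times for equality. The field name `time_ms` and the function name are assumptions for illustration.

```python
def confirm_reproduced(decoded_items, reproduced_times):
    """Split decoded items by whether their reproduction time was already covered."""
    to_output, to_interpolate = [], []
    for item in decoded_items:
        if item["time_ms"] in reproduced_times:  # S34: Yes
            to_interpolate.append(item)          # goes on to S35
        else:                                    # S34: No
            to_output.append(item)               # goes straight to S36
    return to_output, to_interpolate
```

For instance, if priority data has already been reproduced for time 200 ms, a decoded item stamped 200 ms is routed to interpolation while items stamped 100 ms and 300 ms are output directly.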

In S36, the reproduction data output unit 15B sorts the data by reproduction time and then outputs it in order to the determined player (the video player 231B or the audio player 232B). The priority data, which was sent separately and ahead of the rest, is superimposed on the decoded non-priority data according to the determined reproduction time and output as output data.

As a specific example, in a rock-paper-scissors scene, the output is data in which the video data of the played hand, which is the priority data, is superimposed on the hand region of the decoded video data.
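The sorting and superimposition in S36 can be sketched as follows. This is an illustration under assumptions: frames are nested lists of pixel values, the position of the hand region (`top`, `left`) is taken as already known, and the function and field names are hypothetical.

```python
def sort_by_time(items):
    """Order output items by their reproduction time."""
    return sorted(items, key=lambda item: item["time_ms"])


def overlay(base, patch, top, left):
    """Return a copy of `base` with `patch` written over the region at (top, left)."""
    out = [row[:] for row in base]  # copy so the decoded frame is left untouched
    for r, patch_row in enumerate(patch):
        out[top + r][left:left + len(patch_row)] = patch_row
    return out
```

Here `base` plays the role of the decoded non-priority frame and `patch` the unencoded pixel values of the played hand.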

The video player 231B and the audio player 232B perform reproduction processing based on the input data (S37), so that video is displayed on the display unit 233B and audio is output from the speaker 234B.
In a rock-paper-scissors scene, the display unit 233B displays video in which the video data of the played hand, which is the priority data, is superimposed on the hand region of the decoded video data.

(Specific example of the information processing method related to delay control)
Next, as an example of the information processing method related to delay control, a case in which a scene of Mr. A and Mr. B playing rock-paper-scissors from different points is recognized is described with reference to FIGS. 4 and 5.

"Janken" is a game played using only the hands. It decides a winner and a loser through three hand shapes that form a three-way deadlock. In English-speaking countries, for example, janken is called rock-paper-scissors.

In Japan, the three hand shapes are generally "goo", in which all five fingers are clenched into a fist; "choki", in which the index and middle fingers are extended and the other fingers are clenched; and "par", in which all five fingers are extended.
"Goo" corresponds to rock, "choki" corresponds to scissors, and "par" corresponds to paper in rock-paper-scissors.

In janken, "goo" beats "choki" but loses to "par", "choki" beats "par" but loses to "goo", and "par" beats "goo" but loses to "choki".
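The three-way rule above fits in a single lookup table. The sketch below is only an illustration of the rule; the names `BEATS` and `judge` are hypothetical and do not appear in this specification.

```python
# Each hand shape maps to the shape it beats.
BEATS = {"goo": "choki", "choki": "par", "par": "goo"}


def judge(hand_a, hand_b):
    """Return "A", "B", or "draw" for one round between players A and B."""
    if hand_a == hand_b:
        return "draw"
    return "A" if BEATS[hand_a] == hand_b else "B"
```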

In Japan, the set phrase "Saisho wa goo, janken pon" is often chanted when playing rock-paper-scissors. The information processing method related to delay control is described here using this chant as an example.

In rock-paper-scissors played with the chant "Saisho wa goo, janken pon", the rule is that all players put out a hand in the "goo" shape at the moment the word "goo" in "Saisho wa goo" is uttered.
Then, following "Saisho wa goo", the chant "janken pon" is uttered, and at the moment the word "pon" is uttered, all players put out a hand in the shape of "goo", "choki", or "par" to decide the winner.

FIG. 4 is a diagram illustrating a specific example of the information processing method related to delay control according to an embodiment of the present technology. The example shown in FIG. 4 is a case in which there is unencoded priority data that is transmitted preferentially.

FIG. 5 is a diagram illustrating a specific example of an information processing method according to a comparative example. The example shown in FIG. 5 is a case in which there is no priority data transmitted preferentially and only non-priority data, which is encoded data, is transmitted to the information processing device of the communication partner.

In the examples shown in FIGS. 4 and 5, Mr. A calls out the chant, and Mr. A and Mr. B, timing their movements to the chant and to each other, each put out a hand in the shape of goo, choki, or par.

First, the comparative example is described with reference to FIG. 5.
FIGS. 5(A) to 5(D) each show, as a sequence of frames, the change over time of a user being captured or of a reproduced video. In FIG. 5, the hand in the first frame in which "goo" is played is circled with a chain-line ellipse.

FIG. 5(A) shows the change over time in Mr. A's movements as Mr. A is captured by the first information processing system 20A. The video data, audio data, and depth data of Mr. A acquired by the first information processing system 20A are encoded and sent to the second information processing system 20B.

In the second information processing system 20B, the data sent from the first information processing system 20A is decoded, and the video and audio are reproduced.
FIG. 5(B) shows the change over time of the reproduced video displayed on the display unit 233B of the second information processing system 20B based on the data sent from the first information processing system 20A. This reproduced video is the video of point A.

As shown in FIGS. 5(A) and 5(B), because of transmission delay, the video is reproduced on the second information processing system 20B side slightly later than it is captured by the first information processing system 20A. In the illustrated example, "goo" appears in the reproduced video displayed by the second information processing system 20B one frame later than it was captured by the first information processing system 20A.

Mr. B, the user on the second information processing system 20B side, plays rock-paper-scissors while watching the video reproduced as shown in FIG. 5(B).

FIG. 5(C) shows the change over time in Mr. B's movements as captured by the second information processing system 20B while Mr. B plays rock-paper-scissors watching the video reproduced as shown in FIG. 5(B).

Watching the reproduced video shown in FIG. 5(B), Mr. B plays "goo", as shown in FIG. 5(C), at the moment Mr. A says the "goo" of "Saisho wa goo". As FIGS. 5(A) and 5(C) show, the timing at which Mr. A plays "goo" and the timing at which Mr. B plays "goo" are out of sync.

The video data, audio data, and depth data of Mr. B acquired by the second information processing system 20B are encoded and sent to the first information processing system 20A.

In the first information processing system 20A, the data sent from the second information processing system 20B is decoded, and the video and audio are reproduced.
FIG. 5(D) shows the change over time of the reproduced video displayed on the display unit 233A of the first information processing system 20A based on the data sent from the second information processing system 20B.

As shown in FIGS. 5(C) and 5(D), because of transmission delay, the video of point B is reproduced on the display unit 233A of the first information processing system 20A later than it is captured by the second information processing system 20B. In the illustrated example, "goo" appears in the video of point B displayed on the display unit 233A at point A three frames later than "goo" was captured on the first information processing system 20A side (see FIG. 5(A)).

Mr. A, the user on the first information processing system 20A side, plays rock-paper-scissors while watching the reproduced video shown in FIG. 5(D).

In other words, Mr. A checks on the display unit 233A the video of the goo hand that Mr. B puts out in time with the "goo" of the "Saisho wa goo" chant, and only then utters the next chant, "janken pon".

As a result, as shown in FIG. 5(A), Mr. A waits three frames between the chant "Saisho wa goo" and the next chant, "janken pon".

In contrast, in the information processing method of the present embodiment shown in FIG. 4, in which delay control is executed, Mr. A's waiting time is one frame, as shown in FIG. 4(A), which is shorter than in the comparative example.

This is described below with reference to FIG. 4. FIGS. 4(A) to 4(D) each show, as a series of frames, the change over time of a user during capture or of a reproduced video. In FIG. 4, the hand sign in the first frame in which "goo" is thrown is enclosed by a chain-line ellipse.

The example described here assumes that the information processing device 21 has already recognized, based on the sensing data acquired by the sensing data acquisition unit 1, that the scene (the user's situation) is a rock-paper-scissors scene.

In the information processing device 21, an identification flag is set on the data, based on the recognized scene, so that the data to be sent preferentially to the partner's information processing device (priority data) can be identified. In a rock-paper-scissors scene, the identification flag is set so that, of the video data, the video data of the hand-sign portion taking the form of "goo" (rock), "choki" (scissors), or "par" (paper) becomes the priority data.

The video data, sound data, and depth data, which are the sensing data, are encoded and transmitted as non-priority data.
Separately from the encoded data (non-priority data), in the rock-paper-scissors scene the pixel values of the video data of the hand-sign portion are transmitted as priority data to the communication partner's information processing device without being encoded. That is, in a rock-paper-scissors scene the video of the hand sign is the important information, so the video data of the hand sign becomes the priority data. Priority data corresponds to information whose delay would prevent the rock-paper-scissors communication between Mr. A and Mr. B from going well.
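The split between the two transmission paths can be sketched as follows. This is only an illustrative sketch under stated assumptions: the names `Region`, `PriorityData`, `extract_priority`, `encode_frame`, and `prepare_transmission` are hypothetical, a frame is modeled as rows of 8-bit grayscale values, and `zlib` merely stands in for the AV codec, which the text does not specify.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Region:
    """Rectangle marked by the identification flag (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int


@dataclass
class PriorityData:
    scene: str                # recognized situation, e.g. "rock-paper-scissors"
    reproduction_time: float  # capture timestamp, used later to schedule reproduction
    region: Region
    pixels: list              # raw pixel rows, deliberately left unencoded


def extract_priority(frame, region):
    """Crop the flagged region and return its raw pixel rows."""
    rows = frame[region.y:region.y + region.height]
    return [row[region.x:region.x + region.width] for row in rows]


def encode_frame(frame):
    """Stand-in for the AV codec (assumption): zlib over the flattened frame."""
    return zlib.compress(bytes(value for row in frame for value in row))


def prepare_transmission(frame, capture_time, scene, flagged_region):
    """Return (priority data, encoded non-priority data) for one frame."""
    non_priority = encode_frame(frame)  # whole frame goes through the codec
    priority = PriorityData(scene, capture_time, flagged_region,
                            extract_priority(frame, flagged_region))
    return priority, non_priority


# A 4x4 test frame whose pixel value is x + 4*y.
frame = [[x + 4 * y for x in range(4)] for y in range(4)]
priority, encoded = prepare_transmission(
    frame, 0.033, "rock-paper-scissors", Region(1, 1, 2, 2))
```

In this sketch the whole frame still travels through the codec path, while the flagged region is additionally carried as raw pixel values, mirroring the description that priority data is sent separately from the encoded data.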

In the information processing system 50, the priority data is output to the reproduction data output unit of the communication partner's information processing device without being encoded or decoded. It therefore incurs no AV codec encoding or decoding delay and can be presented to the communication partner earlier than the non-priority data.

This embodiment takes as an example the case where the scene has been recognized as a rock-paper-scissors scene, so the video data (pixel values) of the hand-sign portion of the rock-paper-scissors player is sent separately and preferentially. Then, in accordance with the determined reproduction time, output data is generated in which the video of the hand-sign portion, sent separately and preferentially, is superimposed on the video data that was encoded, transmitted, and decoded.

To make the effect of the present technology easy to understand, the example shown in FIG. 4 takes the case where the delay control according to the present technology is executed when Mr. B's video is sent to Mr. A.

FIG. 4(A) shows the change over time of Mr. A's movements while Mr. A is being captured by the first information processing system 20A. The video data, sound data, and depth data acquired by the first information processing system 20A are encoded and sent to the second information processing system 20B.

In the second information processing system 20B, video and audio are reproduced based on the data sent from the first information processing system 20A.
FIG. 4(B) shows the change over time of the reproduced video displayed on the display unit 233B.

As shown in FIGS. 4(A) and 4(B), because of transmission delay, the video is reproduced on the second information processing system 20B side slightly later than the time at which it was captured by the first information processing system 20A. In the illustrated example, the timing at which "goo" appears in the video reproduced on the second information processing system 20B side is one frame later than the timing at which it was captured by the first information processing system 20A.

Mr. B, the user on the second information processing system 20B side, plays rock-paper-scissors while watching the video reproduced as shown in FIG. 4(B).

FIG. 4(C) shows the change over time of Mr. B's movements while the second information processing system 20B is capturing Mr. B, who is playing rock-paper-scissors while watching the reproduced video shown in FIG. 4(B).

Watching the reproduced video shown in FIG. 4(B), Mr. B throws "goo" as shown in FIG. 4(C), timed to the "goo" of Mr. A's "Saisho wa Goo". As shown in FIGS. 4(A) and 4(C), the timing at which Mr. A throws "goo" and the timing at which Mr. B throws "goo" are therefore out of sync.

Here, because the scene has already been recognized as a rock-paper-scissors scene, the identification flag is set so that the pixel values of the partial video of the hand-sign portion become the priority data.
In the second information processing system 20B, the partial video data (pixel values) of Mr. B's hand-sign portion is extracted from the video data as priority data, based on the identification flag. The extracted priority data is sent preferentially to the first information processing system 20A without being encoded.

If there is predicted priority data predicted by the priority data prediction unit 6B, this predicted priority data is also sent preferentially to the first information processing system 20A without being encoded. Here, the predicted priority data is the partial video data (pixel values) of the predicted hand-sign portion.

In the first information processing system 20A, the reproduction time and reproduction method are determined for the hand-sign video data (pixel values) sent from the second information processing system 20B as priority data or predicted priority data, and reproduction data of the hand-sign video data, which is the priority data, is generated based on that determination.

Likewise, when there is predicted priority data, the reproduction time and reproduction method are determined for the predicted hand-sign video data that was sent as the predicted priority data, and reproduction data of the predicted hand-sign video data is generated based on that determination.

In addition, in the first information processing system 20A, the encoded data sent from the second information processing system 20B as non-priority data is decoded.

If the decoded data includes data that has already been reproduced using the priority data or the predicted priority data, interpolation data is generated so that the earlier reproduction based on the priority data or predicted priority data joins smoothly with the reproduction based on the decoded data.
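The text does not state how the interpolation data is computed. One possible reading, shown purely as an assumption, is a linear cross-fade between the image patch already reproduced from the priority data and the corresponding patch of the decoded data. The function names `blend` and `interpolation_patches` are hypothetical, and patches are modeled as rows of grayscale values of equal size.

```python
def blend(priority_patch, decoded_patch, weight):
    """Mix two equally sized patches.

    weight = 0.0 returns the priority patch, weight = 1.0 the decoded patch.
    """
    return [
        [round((1 - weight) * p + weight * d) for p, d in zip(p_row, d_row)]
        for p_row, d_row in zip(priority_patch, decoded_patch)
    ]


def interpolation_patches(priority_patch, decoded_patch, steps):
    """Return `steps` intermediate patches that move from priority to decoded."""
    return [
        blend(priority_patch, decoded_patch, i / (steps + 1))
        for i in range(1, steps + 1)
    ]


# Three intermediate patches between two 1x2 patches.
transition = interpolation_patches([[0, 100]], [[100, 0]], steps=3)
```

Other joining strategies (for example, motion-based interpolation) would fit the description equally well; the cross-fade is chosen here only because it is simple to verify.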

The decoded data, the generated reproduction data, and the interpolation data are sorted according to their reproduction times and then output to the video player 231A and the audio player 232A for reproduction. As a result, the reproduced video is displayed on the display unit 233A as shown in FIG. 4(D).
FIG. 4(D) shows the reproduced video of point B displayed on the display unit 233A.
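The sorting step described above can be sketched as follows. The `ReproductionItem` type, its fields, and `sort_for_output` are hypothetical names introduced for illustration; the sketch assumes every item carries a reproduction time on a shared clock and a media type that selects the video or audio player.

```python
from dataclasses import dataclass


@dataclass
class ReproductionItem:
    reproduction_time: float  # seconds on a shared clock (assumption)
    source: str               # "decoded", "priority", or "interpolation"
    media: str                # "video" or "audio"


def sort_for_output(decoded, priority, interpolation):
    """Merge the three kinds of data, order them by reproduction time,
    and split them into a video queue and an audio queue."""
    merged = sorted(
        [*decoded, *priority, *interpolation],
        key=lambda item: item.reproduction_time,
    )
    video_queue = [item for item in merged if item.media == "video"]
    audio_queue = [item for item in merged if item.media == "audio"]
    return video_queue, audio_queue


decoded = [
    ReproductionItem(0.10, "decoded", "video"),
    ReproductionItem(0.10, "decoded", "audio"),
]
priority = [ReproductionItem(0.03, "priority", "video")]
interpolation = [ReproductionItem(0.07, "interpolation", "video")]
video_queue, audio_queue = sort_for_output(decoded, priority, interpolation)
```

With these sample times, the priority item is reproduced first, the interpolation item bridges the gap, and the decoded item follows.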

As shown in FIG. 4(D), the "goo" hand sign enclosed by the broken-line ellipse is based on the hand-sign video data (pixel values) sent preferentially as priority data, and the rest of the video is based on non-priority data. In this way, the video of the hand sign based on the preferentially sent priority data is displayed superimposed on the video based on the non-priority data sent earlier.
As a result, the video of the hand sign, which is important in a rock-paper-scissors scene, is reproduced by the communication partner's information processing device with reduced delay.
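The superimposition can be sketched as pasting the raw priority patch onto a copy of the frame reconstructed from non-priority data. The function name `superimpose` and the frame model (rows of grayscale values) are assumptions for illustration, and the sketch assumes the patch lies entirely inside the frame.

```python
def superimpose(base_frame, patch, x, y):
    """Return a copy of base_frame with patch pasted at column x, row y.

    Assumes the patch fits inside the frame; no clipping is performed.
    """
    output = [row[:] for row in base_frame]  # leave the input frame untouched
    for dy, patch_row in enumerate(patch):
        output[y + dy][x:x + len(patch_row)] = patch_row
    return output


# Frame from non-priority data (all zeros) with a 1x2 priority patch
# placed at column 1 of the bottom row.
base = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
combined = superimpose(base, [[9, 9]], x=1, y=2)
```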

Mr. A, the user on the first information processing system 20A side, plays rock-paper-scissors while watching the reproduced video shown in FIG. 4(D). In other words, Mr. A checks, on the display unit 233A, the video of the rock hand that Mr. B throws in time with the "goo" of "Saisho wa Goo", and then utters the next call, "Jankenpon".

As a result, as shown in FIG. 4(A), Mr. A's waiting time between the call "Saisho wa Goo" and the next call, "Jankenpon", is one frame. That is, compared with the comparative example described with reference to FIG. 5, the time Mr. A waits for information from Mr. B is shortened.

Shortening the waiting time in this way improves efficiency per unit time. It also reduces the feeling of delay experienced by the user, making communication between Mr. A and Mr. B more natural and smoother.

As described above, in the present technology, in communication between a plurality of points that are remote from one another, important data corresponding to the scene is extracted and transmitted preferentially to the communication partner's information processing device, so important information can be presented to the communication partner with reduced delay.

As a result, even in communication conditions where transmission delay tends to be large, such as communication in regions where communication infrastructure is underdeveloped, communication within a country with a large land area, or communication between countries, important information corresponding to the scene can be presented to the communication partner sooner, and the feeling of communication delay given to the user can be reduced.

The present technology can also be applied to edge computing, in which IoT (Internet of Things) terminals automatically transmit information acquired by sensors and the like to servers, and data is processed in a distributed manner by a plurality of servers (edge servers) located physically close to the IoT terminals.

(Examples of applying the information processing method for delay control to other scenes)
The above embodiment was described using a rock-paper-scissors scene as an example, but the present technology is not limited to this. By varying the content of the data transmitted preferentially to the communication partner's information processing system according to the recognized scene, and preferentially transmitting the information that is important for that scene, the feeling of delay experienced by the user during communication can be reduced.
Examples of application to scenes other than rock-paper-scissors are described below, but the present technology is not limited to these scenes.

For example, in a video conference scene, the mouth movements and voice of a person who is about to start speaking become important. Accordingly, the video data of the mouth portion of the person who is about to start speaking and the data of that person's voice are extracted from the sensing data as priority data.

Transmitting such data preferentially to the communication partner's information processing device makes it possible to prevent users at different points from starting to speak at the same time in a video conference. This enables smoother communication between a plurality of different points and reduces the feeling of communication delay given to the user.

As another example, the present technology can also be applied to a remote operation scene in which a person at point A, while listening to spoken instructions from a person at point B, moves an object located at point B using, for example, a robot hand.

In such a remote operation scene, spoken instructions such as "stop" (instruction voice data) and the video data of the robot hand, which is the object approaching the object to be moved, are extracted as priority data.

In the example in which a person at point A moves an object located at point B while listening to instructions from a person at point B, when point B is the transmission point, the data of the instruction voice uttered by the person at point B is extracted from the sound data acquired at point B. In addition, the video data of the robot hand portion is extracted from the video data acquired at point B.
The extracted instruction voice data and video data (pixel values) of the robot hand portion are transmitted preferentially, as priority data, to the information processing device at point A, which is the receiving side.

As another example of a remote operation scene, consider a case in which a person at point A uses a robot hand to move an object located at point A while listening to instructions from a person at point B. When point B is the transmission point, the data of the instruction voice uttered by the person at point B is extracted from the sound data acquired at point B. This instruction voice data is transmitted preferentially, as priority data, to the information processing device at point A, which is the receiving point.
Conversely, when point A is the transmission point, the video data of the robot hand portion is extracted from the video data acquired at point A. This video data of the robot hand portion is transmitted preferentially, as priority data, to the information processing device at point B, which is the receiving point.

As described above, the instruction voice data and the video data of the robot hand portion, which are important in a remote operation scene, are transmitted preferentially to the communication partner's information processing device. This reduces the feeling of communication delay given to the user and enables smoother remote operation.

As yet another example, the present technology can also be applied to a remote pointing scene.
A remote pointing scene is, for example, a scene in which video of both point A and point B is displayed on each of the display unit 233A at point A and the display unit 233B at point B, and a person at point A points at an object appearing in the video of point B displayed on the display unit 233A. In this case, the location pointed at from point A is displayed as a pointing point in the video of point B displayed on the display unit 233B.

In a remote pointing scene, the movement of the finger in the video is important. Accordingly, of the video data acquired at point A, the video data (pixel values) of the pointing finger portion is extracted and transmitted preferentially, as priority data, to the information processing device at point B.
Transmitting the video data of the finger portion preferentially in this way synchronizes the pointing direction of the finger on the transmitting side with the pointing point displayed on the receiving side, and reduces the feeling of communication delay given to the user.
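The scene-dependent choice of priority data in the examples above can be summarized as a lookup table. The sketch below is illustrative only: the dictionary `PRIORITY_RULES`, the scene labels, and the function `priority_targets` are hypothetical names, and the table simply restates the scenes and data described in the text.

```python
# Scene label -> what is extracted as priority data, per data type.
PRIORITY_RULES = {
    "rock-paper-scissors": {"video": ["hand sign"], "sound": []},
    "video conference": {
        "video": ["mouth of the person about to speak"],
        "sound": ["voice of the person about to speak"],
    },
    "remote operation": {
        "video": ["robot hand"],
        "sound": ["instruction voice"],
    },
    "remote pointing": {"video": ["pointing finger"], "sound": []},
}


def priority_targets(scene):
    """Return the priority targets for a recognized scene.

    Unknown scenes yield no priority data, so everything would be sent
    as encoded non-priority data (an assumption of this sketch).
    """
    return PRIORITY_RULES.get(scene, {"video": [], "sound": []})


targets = priority_targets("remote operation")
```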

Embodiments of the present technology are not limited to the embodiment described above, and various modifications can be made without departing from the gist of the present technology.

For example, although the above embodiment took communication between two points as an example, the present technology can also be applied to communication among three or more points.

In the above embodiment, sound data, video data, and depth data were given as examples of the sensing data to be acquired, but it is sufficient that at least video data is acquired. By extracting part of the video data according to the scene and transmitting it preferentially to the communication partner's information processing device, the feeling of communication delay given to the user can be reduced.

The present technology can also have the following configurations.

(1)
An information processing device including:
an acquisition unit that acquires data related to a transmission point;
an encoding unit that encodes the data related to the transmission point;
a context recognition unit that sets, from the data related to the transmission point, data to be preferentially transmitted, based on a situation of the transmission point recognized using the data related to the transmission point;
a priority data extraction unit that extracts the data to be preferentially transmitted as priority data, based on the setting made by the context recognition unit; and
a communication unit that transmits the data encoded by the encoding unit and the unencoded priority data to an information processing device at a receiving point.

(2)
The information processing device according to (1) above, in which
the priority data extraction unit extracts, as the priority data, the data to be preferentially transmitted, the situation of the transmission point, and a reproduction time of the data to be preferentially transmitted.

(3)
The information processing device according to (1) or (2) above, further including:
a storage unit that stores the priority data; and
a priority data prediction unit that predicts data to be preferentially transmitted, based on the priority data stored in the storage unit.

(4)
The information processing device according to any one of (1) to (3) above, in which
the data related to the transmission point includes video data.

(5)
The information processing device according to (4) above, in which
the data related to the transmission point further includes at least one of sound data and depth data.

(6)
An information processing device including:
a communication unit that receives, from an information processing device at a transmission point, data obtained by encoding data related to the transmission point and unencoded priority data extracted from the data related to the transmission point;
a decoding unit that decodes the encoded data;
a determination unit that determines a reproduction time and a reproduction method of the unencoded priority data;
a reproduction data generation unit that generates reproduction data of the priority data, based on the determination made by the determination unit; and
an output unit that outputs the data decoded by the decoding unit and the reproduction data of the priority data.

(7)
The information processing device according to (6) above, further including:
a storage unit that stores the content of the determination made by the determination unit;
a reproduced confirmation unit that refers to the determination content stored in the storage unit and confirms whether the decoded data includes data that has already been reproduced by means of the reproduction data of the priority data; and
an interpolation data generation unit that, when the reproduced confirmation unit confirms that the reproduction data of the priority data has already been reproduced, generates interpolation data for joining the reproduction data of the priority data and the decoded data.

(8)
An information processing method in which
an information processing device at a transmission point
acquires data related to the transmission point,
encodes the data related to the transmission point,
extracts, as priority data, data to be preferentially transmitted from the data related to the transmission point, based on a situation of the transmission point recognized using the data related to the transmission point, and
transmits the encoded data and the unencoded priority data to an information processing device at a receiving point, and
the information processing device at the receiving point
receives the encoded data and the unencoded priority data,
decodes the encoded data,
determines a reproduction time and a reproduction method of the unencoded priority data,
generates reproduction data of the priority data based on the determination, and
outputs the decoded data and the reproduction data of the priority data.

1A, 1B ... Sensing data acquisition unit (acquisition unit)
2A, 2B ... Data encoding unit (encoding unit)
3A, 3B ... Context recognition unit
4A, 4B ... Priority data extraction unit
6A, 6B ... Priority data prediction unit
7A, 7B ... Communication unit
9A, 9B ... Reproduction time/reproduction method determination unit (determination unit)
10A, 10B ... Data decoding unit (decoding unit)
11A, 11B ... Priority data reproduction storage unit (storage unit)
12A, 12B ... Reproduction data generation unit
13A, 13B ... Priority data reproduced confirmation unit (reproduced confirmation unit)
14A, 14B ... Interpolation data generation unit
15A, 15B ... Reproduction data output unit (output unit)
20A ... First information processing device (information processing device at the transmission point, information processing device at the receiving point)
20B ... Second information processing device (information processing device at the transmission point, information processing device at the receiving point)
50 ... Information processing system

Claims

An information processing device comprising:
an acquisition unit that acquires data related to a transmission point;
an encoding unit that encodes the data related to the transmission point;
a context recognition unit that sets, from the data related to the transmission point, data to be preferentially transmitted, based on a situation of the transmission point recognized using the data related to the transmission point;
a priority data extraction unit that extracts the data to be preferentially transmitted as priority data, based on the setting made by the context recognition unit; and
a communication unit that transmits the data encoded by the encoding unit and the unencoded priority data to an information processing device at a receiving point.

The information processing device according to claim 1, wherein
the priority data extraction unit extracts, as the priority data, the data to be preferentially transmitted, the situation of the transmission point, and a reproduction time of the data to be preferentially transmitted.

The information processing device according to claim 2, further comprising:
a storage unit that stores the priority data; and
a priority data prediction unit that predicts data to be preferentially transmitted, based on the priority data stored in the storage unit.

The information processing device according to claim 3, wherein
the data related to the transmission point includes video data.

The information processing device according to claim 4, wherein
the data related to the transmission point further includes at least one of sound data and depth data.

An information processing device comprising:
a communication unit that receives, from an information processing device at a transmission point, data obtained by encoding data related to the transmission point and unencoded priority data extracted from the data related to the transmission point;
a decoding unit that decodes the encoded data;
a determination unit that determines a reproduction time and a reproduction method of the unencoded priority data;
a reproduction data generation unit that generates reproduction data of the priority data, based on the determination made by the determination unit; and
an output unit that outputs the data decoded by the decoding unit and the reproduction data of the priority data.

The information processing device according to claim 6, further comprising:
a storage unit that stores the content of the determination made by the determination unit;
a reproduced confirmation unit that refers to the determination content stored in the storage unit and confirms whether the decoded data includes data that has already been reproduced by means of the reproduction data of the priority data; and
an interpolation data generation unit that, when the reproduced confirmation unit confirms that the reproduction data of the priority data has already been reproduced, generates interpolation data for joining the reproduction data of the priority data and the decoded data.

An information processing method wherein
an information processing device at a transmission point
acquires data related to the transmission point,
encodes the data related to the transmission point,
extracts, as priority data, data to be preferentially transmitted from the data related to the transmission point, based on a situation of the transmission point recognized using the data related to the transmission point, and
transmits the encoded data and the unencoded priority data to an information processing device at a receiving point, and
the information processing device at the receiving point
receives the encoded data and the unencoded priority data,
decodes the encoded data,
determines a reproduction time and a reproduction method of the unencoded priority data,
generates reproduction data of the priority data based on the determination, and
outputs the decoded data and the reproduction data of the priority data.