JP6186627B2

JP6186627B2 - Multimedia device and program

Info

Publication number: JP6186627B2
Application number: JP2013200409A
Authority: JP
Inventors: 山田　誠; 堀内　俊治
Original assignee: KDDI Corp
Current assignee: KDDI Corp
Priority date: 2013-09-26
Filing date: 2013-09-26
Publication date: 2017-08-30
Anticipated expiration: 2033-09-26
Also published as: JP2015070324A

Description

The present invention relates to a technique for localizing an image displayed on a display together with the sound output from a plurality of loudspeakers.

In recent years, the capacity of recording devices has increased, and information devices such as personal computers, mobile phones, and smartphones can now store and play back large amounts of video and audio content. A large amount of video and audio content can also be played back over the Internet. When video content is played back, the video is generally shown on a display provided in the information device, and the audio is played from one or more loudspeakers built into the information device.

However, the positions of the one or more loudspeakers built into an information device are, naturally, fixed in advance. Meanwhile, video content is generally assumed to be played back on a display in landscape orientation. Accordingly, in an information device having a plurality of loudspeakers, stereo audio playback during video playback works well if the loudspeakers were installed on the assumption that the display would be placed in landscape orientation. Otherwise, the localization of the video and the audio does not match, and the result is very unnatural.

For example, in an information device such as a smartphone, whose housing is substantially rectangular in plan view, two loudspeakers are often arranged side by side on one short side of the rectangle. When playing back video content on such a smartphone, the user views the content with the display in landscape orientation. Because the sound is then output from the two loudspeakers arranged side by side on one short side, the user hears sound from only one direction, either left or right. In this situation the localization of the video and the audio does not match, which gives the user a sense of incongruity. The acoustic effect is especially impaired when the audio is played back in stereo.

Patent Document 1 addresses this problem by providing three loudspeakers at the upper center, lower left, and lower right of the display, detecting the orientation of the information device, and selecting the two loudspeakers suited to stereo playback. Patent Document 2 addresses it by providing three loudspeakers at three corners of the display, detecting the orientation of the information device, and selecting the two loudspeakers suited to stereo playback. Patent Document 3 addresses it by providing two loudspeakers on a diagonal of the display, detecting the orientation of the information device, and switching the left and right channels.

Patent Document 1: JP 2006-174277 A
Patent Document 2: JP 2009-232424 A
Patent Document 3: JP 2003-078601 A

However, the techniques described in Patent Documents 1 and 2 require three loudspeakers, which increases the number of parts. As a result, downsizing the information device becomes harder and the cost may rise. In the method of Patent Document 3, the two loudspeakers are arranged on a diagonal, so the video and audio are localized along the diagonal, which is very unnatural when the information device is large. High user satisfaction therefore could not be obtained with a small number of parts. Furthermore, the conventional techniques consider only whether the display is held in portrait or landscape orientation and do not consider the position from which the user views the video content. Consequently, they cannot optimize the audio according to the user's position.

The present invention has been made in view of these circumstances. Its object is to provide a multimedia device and a program that can match the localization of video and audio according to the position from which the user views video content, without changing the number or positions of the preinstalled loudspeakers.

(1) To achieve the above object, the present invention adopts the following means. That is, the multimedia device of the present invention is a multimedia device having a display and a plurality of loudspeakers, comprising: a position information acquisition unit that acquires a viewing position at which a user views multimedia data and an installation position of each loudspeaker; and a virtual sound source generation unit that corrects an input signal so that the sound output from each loudspeaker is adapted, at the viewing position, to the image displayed on the display.

Because the input signal is corrected in this way so that the sound output from each loudspeaker is adapted, at the user's viewing position, to the image displayed on the display, the installation position of each loudspeaker is virtually moved according to the position from which the user views the video content and a virtual sound source is generated. The localization of the video and the audio can thereby be matched.

(2) In the multimedia device of the present invention, the virtual sound source generation unit includes at least one filter and corrects the input signal using the filter. The filter is created based on a first head-related transfer function, which indicates the sound transfer characteristic between the viewing position and the installation position of each loudspeaker, and a second head-related transfer function, which indicates the sound transfer characteristic between the viewing position and the installation position of a virtual loudspeaker that outputs, to the viewing position, sound adapted to the image displayed on the display.

Because the input signal is corrected using a filter created based on the first and second head-related transfer functions, a virtual loudspeaker can be generated as a virtual sound source at a position in space different from the installation position of each loudspeaker, and this virtual sound source can provide sound adapted, at the user's viewing position, to the image displayed on the display. This makes it possible to match the localization of the video and the audio according to the position from which the user views the video content.

(3) In the multimedia device of the present invention, the virtual sound source generation unit holds a plurality of first head-related transfer functions and a plurality of second head-related transfer functions, creates the filter according to the user's viewing position and the installation position of each loudspeaker acquired by the position information acquisition unit, and corrects the input signal using the created filter.

Because a plurality of first head-related transfer functions and a plurality of second head-related transfer functions are held and the filter is created according to the user's viewing position and the installation position of each loudspeaker acquired by the position information acquisition unit, the localization of the video and the audio can be matched in real time according to the user's viewing position and the current installation position of each loudspeaker. For example, when the user rotates the information device by 90 degrees to view video content, the position of the virtual sound source can be reset to follow the rotation.

(4) In the multimedia device of the present invention, the virtual sound source generation unit holds a plurality of filters created in advance for a plurality of viewing positions or a plurality of installation positions of the loudspeakers, selects one of the filters according to the viewing position and the installation position of each loudspeaker acquired by the position information acquisition unit, and corrects the input signal using the selected filter.

Because a plurality of filters created in advance for a plurality of viewing positions or a plurality of installation positions of the loudspeakers are held and one of them is selected according to the viewing position and the installation position of each loudspeaker acquired by the position information acquisition unit, processing is fast and the localization of the video and the audio can be matched quickly. For example, by selecting one of the filters in response to a selection operation by the user, the position of the virtual sound source can be set according to the user's choice.

(5) The program of the present invention is a program for controlling the operation of a multimedia device having a display and a plurality of loudspeakers. The program causes a computer to execute a series of processes comprising: a process of acquiring a viewing position at which a user views multimedia data and an installation position of each loudspeaker; and a process of correcting an input signal so that the sound output from each loudspeaker is adapted, at the viewing position, to the image displayed on the display. The correction uses at least one filter created based on a first head-related transfer function, which indicates the sound transfer characteristic between the viewing position and the installation position of each loudspeaker, and a second head-related transfer function, which indicates the sound transfer characteristic between the viewing position and the installation position of a virtual loudspeaker that outputs, to the viewing position, sound adapted to the image displayed on the display.

Because the input signal is corrected in this way so that the sound output from each loudspeaker is adapted, at the user's viewing position, to the image displayed on the display, the installation position of each loudspeaker is virtually moved according to the position from which the user views the video content and a virtual sound source is generated, so that the localization of the video and the audio can be matched. In addition, because the input signal is corrected using a filter created based on the first and second head-related transfer functions, a virtual loudspeaker can be generated as a virtual sound source at a position in space different from the installation position of each loudspeaker, and this virtual sound source can provide sound adapted, at the user's viewing position, to the image displayed on the display. This makes it possible to match the localization of the video and the audio according to the position from which the user views the video content.

According to the present invention, the input signal is corrected so that the sound output from each loudspeaker is adapted, at the user's viewing position, to the image displayed on the display. The installation position of each loudspeaker is therefore virtually moved according to the position from which the user views the video content and a virtual sound source is generated, so that the localization of the video and the audio can be matched.

FIG. 1 is a diagram showing the concept of the multimedia device according to the present embodiment.
FIG. 2 is a diagram showing an outline of the two types of head-related transfer functions.
FIG. 3 is a block diagram showing the schematic configuration of the multimedia device according to the present embodiment.
FIG. 4 is a flowchart showing the operation of virtually moving the installation position of each loudspeaker to a desired position.

Embodiments for carrying out the present invention are described below with reference to the drawings to aid understanding of the invention. The following embodiments are examples embodying the present invention and do not limit its technical scope.

The multimedia device according to the present embodiment has a display and two or more loudspeakers. To match the localization of video and audio, it applies virtual sound source reproduction technology based on the positional relationship between the display and the two or more loudspeakers, gyro sensor information, and video content information.

FIG. 1 is a diagram showing the concept of the multimedia device according to the present embodiment. FIG. 2 is a diagram showing an outline of the two types of head-related transfer functions. In the present embodiment, to match the localization of video and audio, head-related transfer functions are used to virtually move the positions of the two or more loudspeakers, so that sound is output from virtual loudspeakers at positions different from the installation positions of the loudspeakers.

First, the case of two loudspeakers is described. In FIGS. 1 and 2, let H_1L(f) and H_1R(f) be the head-related transfer functions representing the transfer characteristics of the acoustic paths from loudspeaker 1 to the user's left ear and right ear, respectively. Likewise, let H_2L(f) and H_2R(f) be the head-related transfer functions from loudspeaker 2 to the user's left ear and right ear, respectively. Here, f is the frequency index and t is the time index.

Meanwhile, to virtually move the installation positions of the two or more loudspeakers to desired positions, let V_1L(f) and V_1R(f) be the head-related transfer functions representing the transfer characteristics of the acoustic paths from the destination position to the user's left ear and right ear, respectively. As shown in Equation (1), if the acoustic signals at the user's left and right ears are equal, that is, if the real loudspeakers produce the same signals at the ears as a source at the destination position would, a virtual sound source can be generated at the desired position. Here, W_1L(f) and W_2L(f) are the virtual sound source reproduction transfer functions that are convolved with the left-channel source signal s_L(f, t) to generate the virtual sound source at the desired position. The frequency index f is omitted in Equation (1).
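The image for Equation (1) is not reproduced in this text. Based only on the definitions in the preceding paragraphs, a form consistent with the description would be the following; this is a reconstruction under that assumption, not the published equation image.

```latex
\begin{equation}
\begin{bmatrix} H_{1L} & H_{2L} \\ H_{1R} & H_{2R} \end{bmatrix}
\begin{bmatrix} W_{1L} \\ W_{2L} \end{bmatrix}
=
\begin{bmatrix} V_{1L} \\ V_{1R} \end{bmatrix}
\tag{1}
\end{equation}
```

The first row equates the left-ear signal produced by loudspeakers 1 and 2 with the left-ear signal of the virtual source, and the second row does the same for the right ear.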

If the 2 × 2 matrix of H terms in Equation (1) is denoted A, the two-dimensional vector of W terms x, and the two-dimensional vector of V terms b, Equation (1) can be written as Equation (2).

x is then obtained as in Equation (3).
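The step from Equation (2) to Equation (3) can be sketched in code. The sketch below assumes that Equation (2) reads Ax = b and Equation (3) reads x = A⁻¹b (the equation images are not reproduced here), and that the computation is done independently for each frequency bin. The function name `solve_reproduction_filters` and the singularity threshold are hypothetical and do not come from the patent.

```python
def solve_reproduction_filters(h1l, h2l, h1r, h2r, v_l, v_r):
    """Solve A x = b at one frequency bin (assumed reading of Equations (2)-(3)).

    A = [[h1l, h2l], [h1r, h2r]] holds the HRTFs of the real loudspeakers,
    b = (v_l, v_r) holds the HRTFs of the virtual (destination) position,
    x = (w1, w2) are the reproduction transfer functions for one channel.
    All values are complex numbers. The name and threshold are illustrative.
    """
    det = h1l * h2r - h2l * h1r
    if abs(det) < 1e-12:
        # The two loudspeakers are indistinguishable at this bin.
        raise ValueError("HRTF matrix is singular at this frequency bin")
    # Closed-form inverse of a 2 x 2 matrix applied to b.
    w1 = (h2r * v_l - h2l * v_r) / det
    w2 = (h1l * v_r - h1r * v_l) / det
    return w1, w2
```

In practice the matrix can be badly conditioned at some frequencies, so a real implementation would probably regularize the inverse; that choice is outside what the text describes.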

The obtained x, that is, W_1L(f) and W_2L(f), is first convolved with the left-channel source signal s_L(f, t) to obtain the acoustic signal components to be reproduced from loudspeakers 1 and 2 for the left-channel source signal. Similarly, W_1R(f) and W_2R(f) are obtained for the right-channel source signal s_R(f, t), giving the acoustic signal components to be reproduced from loudspeakers 1 and 2 for the right-channel source signal. Finally, the acoustic signal components for each of loudspeakers 1 and 2 are added and reproduced from loudspeakers 1 and 2, which makes it possible to virtually move the positions of the two loudspeakers to the desired positions.
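The combination step just described can be sketched as follows. The sketch assumes the signals are handled as one frame of frequency-domain values, so that the convolution becomes a per-bin multiplication; the function name `mix_to_loudspeakers` and the list-based data layout are hypothetical.

```python
def mix_to_loudspeakers(w_1l, w_2l, w_1r, w_2r, s_l, s_r):
    """Combine the left- and right-channel components for loudspeakers 1 and 2.

    Each argument is a list of complex values, one per frequency bin, for a
    single time frame. w_1l[f] is W_1L(f), s_l[f] is s_L(f, t), and so on.
    The per-bin product stands in for the convolution (assumed layout).
    """
    bins = range(len(s_l))
    # Loudspeaker 1 carries the W_1L part of the left channel
    # plus the W_1R part of the right channel.
    y1 = [w_1l[f] * s_l[f] + w_1r[f] * s_r[f] for f in bins]
    # Loudspeaker 2 carries the W_2L and W_2R parts.
    y2 = [w_2l[f] * s_l[f] + w_2r[f] * s_r[f] for f in bins]
    return y1, y2
```

With W_1L = W_2R = 1 and W_2L = W_1R = 0 at every bin, the function returns the unmodified stereo pair, which is a convenient sanity check.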

When there are three or more loudspeakers, Equation (1) can be written as Equation (4).
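The image for Equation (4) is likewise not reproduced in this text. Extending the two-loudspeaker relation to N loudspeakers, a form consistent with the surrounding description would be the following; the symbols H_nL, H_nR, and W_nL for the n-th loudspeaker are assumed by analogy with the two-loudspeaker case.

```latex
\begin{equation}
\begin{bmatrix}
H_{1L} & H_{2L} & \cdots & H_{NL} \\
H_{1R} & H_{2R} & \cdots & H_{NR}
\end{bmatrix}
\begin{bmatrix} W_{1L} \\ W_{2L} \\ \vdots \\ W_{NL} \end{bmatrix}
=
\begin{bmatrix} V_{1L} \\ V_{1R} \end{bmatrix}
\tag{4}
\end{equation}
```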

Here, N is the number of loudspeakers. If the 2 × N matrix of H terms in Equation (4) is denoted A, the N-dimensional vector of W terms x, and the two-dimensional vector of V terms b, then from Equation (2), Equation (3) can be written as Equation (5).

Here, the superscript + denotes the Moore-Penrose pseudoinverse.
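For N loudspeakers the system has two equations and N unknowns, so the pseudoinverse picks the minimum-norm solution. The sketch below assumes Equation (5) reads x = A⁺b and that the two rows of A are linearly independent, in which case A⁺ = Aᴴ(AAᴴ)⁻¹. The function name `solve_min_norm_filters` is hypothetical, and the rank-deficient case is simply rejected rather than handled.

```python
def solve_min_norm_filters(a_left, a_right, v_l, v_r):
    """Minimum-norm x = A^+ b for a 2 x N matrix A at one frequency bin.

    a_left[n] / a_right[n]: HRTF from loudspeaker n to the left / right ear.
    v_l, v_r: HRTFs from the virtual position to the left / right ear.
    Uses A^+ = A^H (A A^H)^-1, valid only for full row rank (assumption).
    """
    # Gram matrix G = A A^H, a 2 x 2 Hermitian matrix.
    g11 = sum(abs(h) ** 2 for h in a_left)
    g22 = sum(abs(h) ** 2 for h in a_right)
    g12 = sum(hl * hr.conjugate() for hl, hr in zip(a_left, a_right))
    g21 = g12.conjugate()
    det = g11 * g22 - g12 * g21
    if abs(det) < 1e-12:
        raise ValueError("rows of A are linearly dependent at this bin")
    # y = G^-1 b using the closed-form 2 x 2 inverse.
    y1 = (g22 * v_l - g12 * v_r) / det
    y2 = (g11 * v_r - g21 * v_l) / det
    # x = A^H y, one reproduction transfer function per loudspeaker.
    return [hl.conjugate() * y1 + hr.conjugate() * y2
            for hl, hr in zip(a_left, a_right)]
```

For N = 2 with an invertible A this reduces to the ordinary inverse, so the same routine covers both cases described in the text.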

The head-related transfer functions are assumed to have been measured at a plurality of positions and stored in advance. The position information of the display and the two or more loudspeakers built into the information device is known and is likewise stored in advance. The rotation direction of the display built into the information device can be obtained from the information device through a predefined application programming interface.

FIG. 3 is a block diagram showing the schematic configuration of the multimedia device according to the present embodiment. In FIG. 3, the input signal is a stereo or monaural audio signal. The position information acquisition unit 1 provides the virtual sound source generation unit 3 with the positions, relative to the user, of the real loudspeakers as installed and of the virtual loudspeakers that sound virtually. This position information may be fixed, given a predefined viewing posture of the user. The positions of the virtual loudspeakers may be changed freely by the user, or may be measured automatically based on video acquired from the camera 9 and information acquired using the gyro sensor 11.

Based on the loudspeaker position information provided by the position information acquisition unit 1, the virtual sound source generation unit 3 determines a filter for virtually moving the loudspeaker positions to the desired positions, using that position information, the head-related transfer functions measured in advance, and the characteristics of the real loudspeakers. Specifically, it selects one filter from the filter set 5, which consists of a plurality of filters created in advance, convolves it with the input signal, and generates the signals (acoustic signals) to be output from the real loudspeakers.
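The select-then-convolve behavior described for the virtual sound source generation unit 3 can be sketched as below. The patent does not specify how the filter set 5 is indexed; keying each filter by a display rotation and a viewing azimuth, and choosing the nearest azimuth, are assumptions made for illustration, as are the names `FilterEntry`, `select_filter`, and `convolve`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterEntry:
    rotation_deg: int    # display rotation the filter was designed for (assumed key)
    azimuth_deg: float   # viewing azimuth the filter was designed for (assumed key)
    taps: tuple          # precomputed time-domain filter coefficients


def select_filter(filter_set, rotation_deg, azimuth_deg):
    """Pick the precomputed filter closest to the current situation."""
    candidates = [e for e in filter_set if e.rotation_deg == rotation_deg % 360]
    if not candidates:
        raise LookupError("no filter prepared for this display rotation")
    return min(candidates, key=lambda e: abs(e.azimuth_deg - azimuth_deg))


def convolve(signal, taps):
    """Direct-form linear convolution of an input signal with filter taps."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out
```

A caller would run `convolve(input_samples, select_filter(filter_set, rotation, azimuth).taps)` once per loudspeaker and channel. Choosing from a prepared set trades memory for speed, which matches the advantage the text attributes to the precomputed filters.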

Alternatively, the filter for virtually moving the loudspeaker positions to the desired positions may be derived each time the loudspeaker position information changes, based on the loudspeaker installation positions, the head-related transfer functions measured in advance, and the characteristics of the real loudspeakers.

FIG. 4 is a flowchart showing the operation of virtually moving the installation position of each loudspeaker to a desired position. First, the position information of the display and the two (or more) loudspeakers is read (step S1). Next, the rotation direction information of the display is acquired (step S2). This covers, for example, the case where the user rotates a display that had been used in portrait orientation into landscape orientation to view video content. Next, the positions of the two (or more) loudspeakers are virtually moved to the desired positions (step S3), and the process returns to step S2.
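The control flow of steps S1 to S3 can be sketched as a simple loop. The three callables stand in for device functions that the patent does not name, and `max_iterations` is added only so the sketch terminates; the flowchart itself loops back to S2 without an exit condition.

```python
def run_localization(read_positions, read_rotation, apply_virtual_move,
                     max_iterations):
    """Sketch of the S1 -> S2 -> S3 -> S2 loop (all callables are hypothetical).

    read_positions():     returns the stored display / loudspeaker positions (S1)
    read_rotation():      returns the current display rotation (S2)
    apply_virtual_move(): updates the filters for that rotation (S3)
    """
    positions = read_positions()                  # step S1, done once
    for _ in range(max_iterations):
        rotation = read_rotation()                # step S2
        apply_virtual_move(positions, rotation)   # step S3, then back to S2
```

A real implementation would more likely react to rotation-change events than poll, but polling mirrors the flowchart most directly.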

Thus, according to the present embodiment, the sound output from each loudspeaker is adapted, at the user's viewing position, to the image displayed on the display. The installation position of each loudspeaker is virtually moved according to the position from which the user views the video content and a virtual sound source is generated, so that the localization of the video and the audio can be matched.

DESCRIPTION OF SYMBOLS
1 Position information acquisition unit
3 Virtual sound source generation unit
5 Filter set
9 Camera
11 Gyro sensor

Claims

A multimedia device in which a rotatable display and a plurality of loudspeakers are integrated, comprising:
a position information acquisition unit that acquires a viewing position at which a user views multimedia data and the positions of the display and each loudspeaker; and
a virtual sound source generation unit that corrects an input signal so that the sound output from each loudspeaker is adapted, at the viewing position, to the image displayed on the display,
wherein, when the display is rotated, the virtual sound source generation unit corrects the input signal using at least one filter created based on a first head-related transfer function indicating the sound transfer characteristic between the viewing position after rotation of the display and the position of each loudspeaker, and a second head-related transfer function indicating the sound transfer characteristic between the viewing position and the position of a virtual loudspeaker that outputs, to the viewing position, sound adapted to the image displayed on the rotated display.

The multimedia device according to claim 1, wherein the position information acquisition unit includes a gyro sensor and acquires the positions of the display and each loudspeaker based on information acquired using the gyro sensor.

The multimedia device according to claim 1, wherein the virtual sound source generation unit holds a plurality of first head-related transfer functions and a plurality of second head-related transfer functions, creates the filter according to the user's viewing position and the positions of the display and each loudspeaker acquired by the position information acquisition unit, and corrects the input signal using the created filter.

The multimedia device according to claim 1, wherein the virtual sound source generation unit holds a plurality of filters created in advance for a plurality of viewing positions or a plurality of positions of the display and each loudspeaker, selects one of the filters according to the viewing position and the positions of the display and each loudspeaker acquired by the position information acquisition unit, and corrects the input signal using the selected filter.

A program for controlling the operation of a multimedia device in which a rotatable display and a plurality of loudspeakers are integrated, the program causing a computer to execute a series of processes comprising:
a process of acquiring a viewing position at which a user views multimedia data and the positions of the display and each loudspeaker; and
a process of correcting, when the display is rotated, an input signal so that the sound output from each loudspeaker is adapted, at the viewing position, to the image displayed on the display, using at least one filter created based on a first head-related transfer function indicating the sound transfer characteristic between the viewing position after rotation of the display and the position of each loudspeaker, and a second head-related transfer function indicating the sound transfer characteristic between the viewing position and the position of a virtual loudspeaker that outputs, to the viewing position, sound adapted to the image displayed on the rotated display.