JP2006121611A

JP2006121611A - Telephone system, telephone system management apparatus, advertisement content distribution method, advertisement content distribution program and recording medium

Info

Publication number: JP2006121611A
Application number: JP2004309929A
Authority: JP
Inventors: Hiroshi Miwa; 浩士三輪; Kumiko Shirai; 久美子白井; Hiroshi Harano; 博志原野; Osamu Minoura; 治箕浦; Tadao Nakamura; 忠雄中村
Original assignee: Nippon Telegraph and Telephone Corp; Nippon Telegraph and Telephone West Corp
Current assignee: Nippon Telegraph and Telephone Corp; Nippon Telegraph and Telephone West Corp
Priority date: 2004-10-25
Filing date: 2004-10-25
Publication date: 2006-05-11

Abstract

PROBLEM TO BE SOLVED: To provide a telephone system, telephone system management apparatus, and advertisement content distribution method in which advertisement contents can also be effectively distributed to telephone terminals utilized by a number of unspecific persons and communication privacy can also be protected.

SOLUTION: A telephone system management apparatus 101 controls communication between telephone terminals A, B connected to a communication network 102, selects any advertisement contents out of a plurality of advertisement contents A, B, C registered in advance and distributes the selected advertisement contents to the telephone terminals A, B in use. The telephone system management apparatus comprises a video identification unit 107 for estimating, from video images transmitted from the telephone terminals A, B, user attribute elements of users utilizing the telephone terminals A, B and an audio identification unit 108 for estimating the user attribute elements from audio transmitted from the telephone terminals A, B. Estimated results of both the identification units are used to determine the user attribute elements and the advertisement contents are selected in accordance with the user attribute elements.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a telephone system for distributing advertising content to telephone terminals, to a telephone system management apparatus and an advertising content distribution method for configuring such a telephone system, and to an advertising content distribution program for the telephone system and a computer-readable recording medium.

In recent years, incentive advertising services have been offered in which advertising content is distributed to a telephone terminal that is in use, and the advertiser bears communication costs or provides a similar benefit according to how long the user of the terminal views the advertising content. In this type of service, advertisers require that the distributed advertising content be viewed effectively by its intended audience, and users require that the advertising content delivered to them be useful. Various telephone systems have been proposed to meet these requirements.

For example, there is a telephone system that distributes advertising content via a multipoint connection unit (MCU) to each videophone terminal of a multipoint video conference system (see Patent Document 1). In the telephone system of Patent Document 1, however, the MCU sends the same advertising content to every videophone terminal. The result is indiscriminate, untargeted advertising that conference participants are unlikely to notice, so a sufficient advertising effect cannot be expected.

Telephone systems have therefore been proposed that select one item of advertising content, according to a predetermined criterion, from among a plurality of items registered in advance and distribute the selected content to a telephone terminal.

For example, there are telephone systems in which a plurality of items of advertising content and users' personal information associated with telephone numbers are registered in advance. The telephone system management apparatus uses the calling or called number to look up the personal information of the user corresponding to that number, selects advertising content according to that information, and distributes it to the telephone terminal having that number (see Patent Documents 2, 3 and 4). However, when advertising content is distributed to a telephone terminal used by an unspecified number of people, such as a terminal shared by a family or a company, the registered individual often differs from the person actually using the terminal, so a system such as that of Patent Document 2 cannot deliver advertising content effectively. Moreover, because such a system requires personal information to be registered in advance, it cannot provide an incentive advertising service to users of public telephone terminals.

In telephone systems of another configuration, a voice identification unit that identifies the content of a call between telephone terminals is provided, and advertising content is selected from among a plurality of items registered in advance according to the call content estimated by the voice identification unit (see Patent Documents 5 and 6).

Patent Document 1: JP 2002-16897 A
Patent Document 2: JP-A-11-55408
Patent Document 3: JP 2002-290565 A
Patent Document 4: JP 2003-060791 A
Patent Document 5: JP 2002-165193 A
Patent Document 6: JP 2002-271507 A

However, although a telephone system such as that of Patent Document 5 can distribute advertising content effectively to telephone terminals used by an unspecified number of people, the voice identification unit estimates the content of the call, so the secrecy of communication is not maintained and the user's privacy is infringed. In addition, if the voice identification unit is implemented in the telephone terminal, the burden on the user who purchases the terminal increases, and the user must also update the recognition dictionary needed for voice identification. If the user neglects this update, the voice identification rate of the telephone system gradually falls, and appropriate advertising content may no longer be delivered.

An object of the present invention is therefore to provide a telephone system, a telephone system management apparatus, an advertising content distribution method, and a program and recording medium therefor that can distribute advertising content effectively even to telephone terminals used by an unspecified number of people while also protecting the secrecy of communication.

As a first means for solving the above problem, the present invention concerns a telephone system comprising a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, in which the telephone system management apparatus selects one item of advertising content from among a plurality of items registered in advance and distributes the selected content to a telephone terminal that is in use. In this system, the telephone system management apparatus includes a voice identification unit that estimates, from the voice of the user of the telephone terminal, a user attribute element of that user, and selects the advertising content according to the user attribute element estimated by the voice identification unit.

Here, a user attribute element is a characteristic possessed by the user of a telephone terminal, such as sex, age group, race, or nationality.

With the configuration of the first means, the telephone system management apparatus includes a voice identification unit that estimates a user attribute element of the user of the telephone terminal from that user's voice, and selects the advertising content according to the estimated user attribute element. The content of the call is therefore never estimated, and advertising content targeted at people having the estimated user attribute element is distributed to the telephone terminal. For example, each item of advertising content may be registered in association with the attribute elements of its target audience, and the telephone system management apparatus may select the advertising content whose associated target attribute elements are closest to the estimated user attribute elements.
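The selection rule described above, choosing the registered content whose target attributes are closest to the estimate, can be sketched as follows. This is only an illustrative sketch: the `Ad` class, the attribute keys (`sex`, `age_group`), and the "count of matching attributes" closeness measure are assumptions introduced here, not something the specification defines.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    """Hypothetical record: one registered item of advertising content."""
    name: str
    target: dict  # target-audience attribute elements, e.g. {"sex": "female"}


def match_score(estimated: dict, target: dict) -> int:
    """Count the target attribute elements that agree with the estimate."""
    return sum(1 for key, value in target.items() if estimated.get(key) == value)


def select_ad(estimated: dict, ads: list) -> Ad:
    """Return the registered ad whose target attributes best match the estimate.

    Ties are resolved in registration order (an assumption of this sketch).
    """
    if not ads:
        raise ValueError("no advertising content registered")
    return max(ads, key=lambda ad: match_score(estimated, ad.target))


# Advertising contents A, B, C with assumed target audiences.
ADS = [
    Ad("A", {"sex": "female", "age_group": "20s"}),
    Ad("B", {"sex": "male", "age_group": "40s"}),
    Ad("C", {"sex": "female", "age_group": "50s"}),
]

if __name__ == "__main__":
    chosen = select_ad({"sex": "female", "age_group": "50s"}, ADS)
    print(chosen.name)
```

A weighted distance over attribute elements could replace the simple match count if some elements matter more to advertisers than others.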

As a second means for solving the above problem, the present invention concerns a telephone system comprising a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, in which the telephone system management apparatus selects one item of advertising content from among a plurality of items registered in advance and distributes the selected content to a telephone terminal that is in use. In this system, the telephone system management apparatus includes a video identification unit that, when video showing the user of a telephone terminal is transmitted from that terminal, estimates a user attribute element of the user from the video, and selects the advertising content according to the user attribute element estimated by the video identification unit.

With the configuration of the second means, the telephone system management apparatus estimates the user attribute elements of the user of the telephone terminal from video of that user. In addition to sex, age group, and race, user attribute elements that are difficult to identify by voice, such as preferences in makeup or accessories, are therefore also estimated without reference to the content of the call, and advertising content targeted at people having the estimated user attribute elements is distributed to the telephone terminal. For example, each item of advertising content may be registered in association with the attribute elements of its target audience, and the telephone system management apparatus may select the advertising content whose associated target attribute elements are closest to the estimated user attribute elements.

As a third means for solving the above problem, in the second means, the telephone system management apparatus may further include a voice identification unit that estimates a user attribute element of the user of the telephone terminal from that user's voice, and may estimate the user attribute elements by using the voice identification unit and the video identification unit together. With this configuration, the telephone system can estimate more kinds of user attribute elements, so advertising content is distributed to the telephone terminal more effectively. In addition, for user attribute elements that both identification units can estimate, such as sex, age group, and race, the agreement between the estimation result and the attribute elements the user actually has improves. The video may be either moving or still images, and the order in which the processing by the voice identification unit and the processing by the video identification unit are executed is not limited.
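The text does not say how the two estimates are combined, so the following is a sketch under a stated assumption: each identification unit outputs per-class likelihoods for each attribute element, overlapping elements are fused by multiplying the likelihoods (treating the modalities as independent) and renormalizing, and elements available from only one unit are used as they are. All function names and attribute keys are hypothetical.

```python
def fuse(audio: dict, video: dict) -> dict:
    """Fuse per-class likelihoods for one attribute element (e.g. sex).

    Assumption of this sketch: the two modalities are independent, so the
    joint likelihood of a class is the product of the two likelihoods.
    """
    classes = audio.keys() & video.keys()
    joint = {c: audio[c] * video[c] for c in classes}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("no class is supported by both modalities")
    return {c: p / total for c, p in joint.items()}


def estimate_attributes(audio_est: dict, video_est: dict) -> dict:
    """Map each attribute element to its most likely class.

    audio_est / video_est: attribute element -> {class: likelihood}.
    """
    result = {}
    for attr in audio_est.keys() | video_est.keys():
        if attr in audio_est and attr in video_est:
            scores = fuse(audio_est[attr], video_est[attr])  # estimated by both
        elif attr in audio_est:
            scores = audio_est[attr]                          # voice only
        else:
            scores = video_est[attr]                          # video only
        result[attr] = max(scores, key=scores.get)
    return result


if __name__ == "__main__":
    audio = {"sex": {"male": 0.6, "female": 0.4}}
    video = {
        "sex": {"male": 0.3, "female": 0.7},
        "makeup": {"red_lipstick": 0.8, "none": 0.2},  # not available from voice
    }
    print(estimate_attributes(audio, video))
```

In this example the video evidence outweighs the weaker voice evidence for sex, and the makeup preference comes from video alone, which mirrors the two benefits the paragraph describes.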

As a fourth means for solving the above problem, in any of the first to third means, the telephone system management apparatus may acquire the advertising content from an advertising content holding device connected via a network that includes at least one of an IP network and a PSTN. With this configuration, the advertising content holding device can be installed under the control of an advertiser or an advertising agency, and the advertiser or agency can update the advertising content to be distributed without going through the maintainer of the telephone system. The advertiser can thus update advertising content in response to a sudden event, such as a limited-time sale on a holiday, and distribute it through the telephone system.

As a fifth means for solving the above problem, in the third or fourth means, the telephone system management apparatus may select and execute, according to the terminal environment of the telephone terminal, one of the following: estimation by the voice identification unit, estimation by the video identification unit, and estimation using the voice identification unit and the video identification unit together. With this configuration, the telephone system automatically performs the estimation suited to the terminal environment, so appropriate advertising content can be distributed according to the terminal environment.
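The choice among the three estimation processes can be sketched as a small dispatch function. The passage does not define "terminal environment" precisely, so this sketch assumes it means which media streams the terminal actually transmits (for example, a voice-only telephone sends no video). The `Mode` enum and the `choose_mode` function are hypothetical names.

```python
from enum import Enum


class Mode(Enum):
    """The three estimation processes named in the fifth means."""
    VOICE = "voice identification only"
    VIDEO = "video identification only"
    BOTH = "voice and video identification together"


def choose_mode(sends_audio: bool, sends_video: bool) -> Mode:
    """Pick the estimation process that fits what the terminal transmits."""
    if sends_audio and sends_video:
        return Mode.BOTH
    if sends_video:
        return Mode.VIDEO
    if sends_audio:
        return Mode.VOICE
    raise ValueError("terminal transmits neither audio nor video")


if __name__ == "__main__":
    print(choose_mode(sends_audio=True, sends_video=True).value)   # videophone
    print(choose_mode(sends_audio=True, sends_video=False).value)  # voice-only phone
```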

As a sixth means for solving the above problem, in any of the first to fifth means, the telephone system management apparatus may estimate the user attribute elements for each of the telephone terminals whose communication it switches, and select the advertising content for each of these terminals. With this configuration, user attribute elements are estimated for each telephone terminal and the advertising content is selected accordingly, so advertising content is distributed effectively to every terminal.

As a seventh means for solving the above problem, the present invention concerns a telephone system management apparatus that controls communication between telephone terminals connected to a communication network, selects one item of advertising content from among a plurality of items registered in advance, and distributes the selected content to a telephone terminal that is in use. The apparatus includes a voice identification unit that estimates a user attribute element of the user of the telephone terminal from that user's voice, and selects the advertising content according to the user attribute element estimated by the voice identification unit. With this configuration, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the first means.

As an eighth means for solving the above problem, the present invention concerns a telephone system management apparatus that controls communication between telephone terminals connected to a communication network, selects one item of advertising content from among a plurality of items registered in advance, and distributes the selected content to a telephone terminal that is in use. The apparatus includes a video identification unit that, when video showing the user of a telephone terminal is transmitted from that terminal, estimates a user attribute element of the user from the video, and selects the advertising content according to the user attribute element estimated by the video identification unit. With this configuration, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the second means.

As a ninth means for solving the above problem, the eighth means may further include a voice identification unit that estimates a user attribute element of the user of the telephone terminal from that user's voice, and may estimate the user attribute elements by using the voice identification unit and the video identification unit together. With this configuration, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the third means.

As a tenth means for solving the above problem, in the ninth means, one of estimation by the voice identification unit, estimation by the video identification unit, and estimation using the voice identification unit and the video identification unit together may be selected and executed according to the terminal environment of the telephone terminal. With this configuration, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the fifth means.

As an eleventh means for solving the above problem, in any of the seventh to tenth means, the user attribute elements may be estimated for each of the telephone terminals whose communication is switched, and the advertising content may be selected for each of these terminals. With this configuration, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the sixth means.

As a twelfth means for solving the above problem, the present invention concerns an advertising content distribution method using a telephone system that comprises a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, in which the telephone system management apparatus executes a step of selecting one item of advertising content from among a plurality of items registered in advance and a step of distributing the selected content to a telephone terminal that is in use. In this method, the telephone system management apparatus selects and executes, according to the terminal environment of the telephone terminal, one of the following steps: a voice identification step of estimating a user attribute element of the user of the telephone terminal from that user's voice; a video identification step of estimating, when video showing the user of the telephone terminal is transmitted from that terminal, a user attribute element of the user from the video; and a step of estimating the user attribute elements by using the voice identification and the video identification together. With this method, advertising content is distributed effectively to the telephone terminal in use through the same operation and effect as in the fifth means.

As a thirteenth means for solving the above problem, the present invention may take the form of an advertising content distribution program that causes a telephone system to execute the advertising content distribution method of the twelfth means.

As a fourteenth means for solving the above problem, the present invention may take the form of a computer-readable recording medium on which the advertising content distribution program of the thirteenth means is recorded.

As described above, according to the present invention, advertising content is distributed to telephone terminals without the content of the call being identified. The invention can therefore provide a telephone system, a telephone system management apparatus, and an advertising content distribution method that distribute advertising content effectively even to telephone terminals used by an unspecified number of people while also protecting the secrecy of communication.

Embodiments of the telephone system, telephone system management apparatus, and advertising content distribution method according to the present invention are described below with reference to the drawings. Each drawing shows, conceptually, only those parts of the configuration of the telephone system, telephone system management apparatus, and advertising content distribution method that relate to the present invention.

As shown in FIG. 1, the telephone system comprises a plurality of telephone terminals A and B, a telephone system management apparatus 101 that controls communication between these telephone terminals, and a communication network 102 that connects the telephone system management apparatus 101 and the telephone terminals A and B.

The telephone terminals A and B each have a television camera and a display screen and provide a videophone function: they capture the user in order to generate video data, transmit it together with audio data, and reproduce video data and audio data received from outside. As the telephone terminals A and B, for example, a home videophone, a personal computer terminal with an IP (Internet Protocol) videophone function, a mobile phone terminal, or a PHS terminal can be used. An ordinary telephone terminal capable only of voice calls can also be used as telephone terminal A or B.

The telephone terminals A and B become usable once they log in to the telephone system management apparatus 101. Login can be performed by having the user enter and transmit, from the telephone terminal A or B, a user ID and password registered in advance in the telephone system management apparatus 101, which the telephone system management apparatus 101 then authenticates. Alternatively, the telephone terminal A or B can automatically transmit terminal-specific information or the like, which the telephone system management apparatus 101 then authenticates.
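The two login paths described above can be sketched as follows. This is a minimal illustration, not the apparatus's actual authentication scheme: the in-memory registries, the salted PBKDF2 password storage, and all identifiers (`user-a`, `terminal-B-0001`, `login`) are assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical fixed salt for the demo; a real system would use a per-user salt.
SALT = b"demo-salt"


def _digest(password: str) -> bytes:
    """Derive a password hash so that plain passwords are never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), SALT, 100_000)


# Hypothetical registries held by the management apparatus.
REGISTERED_USERS = {"user-a": _digest("secret")}
REGISTERED_TERMINALS = {"terminal-B-0001"}


def login(user_id=None, password=None, terminal_id=None) -> bool:
    """Authenticate by user ID and password, or by terminal-specific information."""
    if user_id is not None and password is not None:
        stored = REGISTERED_USERS.get(user_id)
        # compare_digest avoids leaking information through comparison timing
        return stored is not None and hmac.compare_digest(stored, _digest(password))
    if terminal_id is not None:
        return terminal_id in REGISTERED_TERMINALS
    return False


if __name__ == "__main__":
    print(login(user_id="user-a", password="secret"))
    print(login(terminal_id="terminal-B-0001"))
```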

The communication network 102 is a PSTN (public switched telephone network) or an IP (Internet Protocol) network, and the telephone terminals A and B perform IP communication. A satellite communication network, a mobile circuit-switched network, or a network integrating these can also be used as the communication network 102, and the communication protocol and the like of the telephone terminals A and B can be changed as appropriate.

The telephone system management apparatus 101 has a function of controlling communication between the telephone terminals A and B, and a function of selecting one item of advertising content from among a plurality of advertising contents A, B, and C registered in advance in an advertising content holding device 104 connected via an IP network 103 and distributing the selected content to the telephone terminals A and B that are in use. The advertising content holding device consists of external storage or the like. In this embodiment the content holding device 104 is provided outside the telephone system, both conceptually and physically, but it can also be provided inside the telephone system management apparatus 101. The advertising content may be video content, still-image content, audio content, or a combination of these. A network including any of an IP network, a PSTN, or a dedicated line can be used to connect the telephone system management apparatus 101 and the advertising content holding device 104. In particular, a network including at least one of an IP network and a PSTN is preferable because it allows existing, widely deployed telephone systems to be used. A network including both an IP network and a PSTN can of course also be used.

Conceptually, the telephone system management apparatus 101 comprises a communication control unit 105, a user authentication unit 106, a video identification unit 107, a voice identification unit 108, a main control unit 109, a content control unit 110, a mixing unit 111, a user management unit 112, and an incentive information unit 113.

The communication control unit 105 has a switching function for exchanging communication between the telephone terminals A and B, and a function of controlling communication between the telephone system management apparatus 101 and the telephone terminals A and B.

The user authentication unit 106 performs authentication based on login information, such as a user ID and password, received from the telephone terminals A and B.

The video identification unit 107 has a function of estimating, when video showing the user of a telephone terminal is transmitted from the telephone terminal A or B, the user attribute elements of that user from the video. For each of the telephone terminals A and B whose communication is being exchanged, it estimates user attribute elements that can be identified from video of the user, such as sex, age group, race, makeup preference, and ornament preference.

The video identification unit 107 identifies the input video and estimates the user attribute elements using a conventional technique such as principal component analysis (hereinafter, PCA) of face shape features. For example, by the method of IEICE Technical Report, PRMU2003-142, pp. 13-18, Nov., a likelihood calculation using a Gaussian mixture model (hereinafter, GMM) is performed and the likelihood of the user's gender is obtained. Besides gender, user attribute elements for which a likelihood is calculated include, for example: age group; cosmetics preference, obtained from video identification elements such as lipstick color and eye shadow color; ornament preference, obtained from video identification elements such as the presence and coloring of glasses, earrings, and necklaces; and the user's race or nationality, obtained from video identification elements such as skin and hair color. The likelihood of a user attribute element can also be obtained as an upper probability from the combined likelihood probabilities of these video identification elements. The estimation processing in the video identification unit 107 is described below, taking as an example the estimation of the user's gender by likelihood calculation.

先ず、映像識別部１０７は、次に示す数式１によりユーザの性別の尤度を求める。 First, the video identification unit 107 obtains the likelihood of the user's gender according to Equation 1 shown below.

Here, y is the face shape feature parameter identified by PCA, and v is the GMM parameter. For the gender likelihood calculation, the video identification unit 107 obtains the likelihood ν' with v set to the male GMM parameter and to the female GMM parameter in turn, and from each likelihood calculates the basic probabilities: the probability m(m) that the user is male, the probability m(f) that the user is female, and the probability m(m·f) that the gender cannot be determined. The basic probabilities can be obtained, for example, from the likelihood distribution chart of Dempster-Shafer theory. The video identification unit 107 then integrates the basic probabilities obtained from the male and female GMM parameters using Equation 2 shown below.

最後に、映像識別部１０７は、数式２により求めた確率を比較することにより、ユーザの性別を決定する。なお、上記数式１および数式２を用いた尤度算出は一例であり、他の尤度算出方式を用いることもできる。 Finally, the video identification unit 107 determines the gender of the user by comparing the probabilities obtained by Equation 2. Note that the likelihood calculation using Equation 1 and Equation 2 is an example, and other likelihood calculation methods can be used.
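The integration and comparison steps can be illustrated with a short sketch. Equation 2 is not reproduced in this text, so the sketch assumes it is Dempster's rule of combination, the standard way to fuse basic probabilities in Dempster-Shafer theory. The function names and the numeric masses are illustrative assumptions, not values from the specification.

```python
from itertools import product

MALE = frozenset({"male"})
FEMALE = frozenset({"female"})
UNKNOWN = MALE | FEMALE  # m(m·f): mass on the whole frame, "cannot be determined"


def combine(m1, m2):
    """Fuse two basic probability assignments with Dempster's rule (assumed Equation 2)."""
    fused = {}
    conflict = 0.0
    for (a, pa), (b, pb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            fused[inter] = fused.get(inter, 0.0) + pa * pb
        else:
            conflict += pa * pb  # mass assigned to contradictory hypotheses
    if conflict >= 1.0:
        raise ValueError("the two sources are in total conflict")
    return {k: v / (1.0 - conflict) for k, v in fused.items()}


def decide(m):
    """Compare the fused probabilities and return the more probable gender."""
    male, female = m.get(MALE, 0.0), m.get(FEMALE, 0.0)
    if male == female:
        return "unknown"
    return "male" if male > female else "female"


# Illustrative basic probabilities derived from the male and female GMM likelihoods.
from_male_gmm = {MALE: 0.6, FEMALE: 0.1, UNKNOWN: 0.3}
from_female_gmm = {MALE: 0.5, FEMALE: 0.2, UNKNOWN: 0.3}

fused = combine(from_male_gmm, from_female_gmm)
gender = decide(fused)
```

Representing each hypothesis as a `frozenset` lets set intersection express which hypotheses are compatible, so the same `combine` function would work unchanged for attributes with more than two values.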

The voice identification unit 108 has a function of estimating the user attribute elements of the user of the telephone terminal A or B from the user's voice. For each of the telephone terminals A and B whose communication is being exchanged, it estimates user attribute elements that can be identified from the user's voice, such as sex, age group, and race.

The voice identification unit 108 identifies the input voice and estimates the user attribute elements using a conventional acoustic identification algorithm or the like. For example, by the method of Kasuya et al., "Changes in pitch frequency and formant frequency of the five Japanese vowels with age and sex," Journal of the Acoustical Society of Japan 24, 6 (1968), acoustic features are extracted and the likelihood of user attribute elements such as gender and age group is obtained on the basis of an acoustic model. The estimation processing in the voice identification unit 108 is described below, taking as an example the estimation of the user's gender by likelihood calculation. The user's age group and other elements can be estimated in the same way.

先ず、音声識別部１０８は、次に示す数式３によりユーザの性別の尤度を求める。 First, the voice identification unit 108 obtains the likelihood of the user's gender according to the following Equation 3.

Here, x is the acoustic feature parameter identified by PCA, and w is the hidden Markov model (hereinafter, HMM) parameter. For the gender likelihood calculation, the voice identification unit 108 obtains the likelihood w' with w set to the male HMM parameter and to the female HMM parameter in turn, and from each likelihood calculates the basic probabilities: the probability m(m) that the user is male, the probability m(f) that the user is female, and the probability m(m·f) that the gender cannot be determined. The basic probabilities can be obtained, for example, from the likelihood distribution chart of Dempster-Shafer theory. The voice identification unit 108 then integrates the basic probabilities obtained from the male and female HMM parameters using Equation 2 above. Finally, the voice identification unit 108 determines the user's gender by comparing the probabilities obtained from Equation 2.

The voice identification unit 108 may also identify the spoken language, for example by the technique of "Development of a multilingual speech identification engine" (Acoustical Society of Japan, Proceedings of the Meeting, pp. 115-116, March 2002) or of Japanese Patent Application No. 2002-261672, and estimate the user's nationality from it.

The main control unit 109 controls each component of the telephone system: it handles processing between the telephone system management apparatus 101 and the telephone terminals A and B and the advertisement content holding device 104, issues processing instructions to each unit of the telephone system management apparatus 101, registers and passes on processing results, calls telephone terminals, and so on.

In addition, for each telephone terminal, the main control unit 109 has three selection functions: selecting one of the advertisement contents A, B, and C registered in advance according to the user attribute elements estimated by the video identification unit 107; selecting one of the advertisement contents A, B, and C registered in advance according to the user attribute elements estimated by the voice identification unit 108; and estimating the user attribute elements using the video identification unit 107 and the voice identification unit 108 together and selecting one of the advertisement contents A, B, and C registered in advance in the advertisement content holding device 104 according to the estimated user attribute elements. An example of the advertisement content selection processing in the main control unit 109 is described below.

先ず、主制御部１０９は、映像識別部１０７において求められた各ユーザ属性要素に対する基本尤度確率（推定結果）と、音声識別部１０８において求められた各ユーザ属性要素に対する基本尤度確率（推定結果）とを用いて、同一のユーザ属性要素ごとに、ユーザ属性要素に対する上限確率を算出する。 First, the main control unit 109 calculates a basic likelihood probability (estimation result) for each user attribute element obtained by the video identification unit 107 and a basic likelihood probability (estimation) for each user attribute element obtained by the speech identification unit 108. Result) and the upper limit probability for the user attribute element is calculated for each identical user attribute element.

That is, the main control unit 109 integrates, using Equation 2, the basic likelihood probability for each user attribute element obtained by the video identification unit 107 and the basic likelihood probability for each user attribute element obtained by the voice identification unit 108, and then calculates the upper probability from the integrated likelihood probability using Equation 4 shown below.
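Equation 4 is not reproduced in this text. In Dempster-Shafer theory the upper probability of a hypothesis is conventionally its plausibility, so, assuming Equation 4 follows that convention, it would take the following form for the two-element gender frame, where m denotes the integrated basic probabilities:

```latex
\mathrm{Pl}(A) = \sum_{B \cap A \neq \emptyset} m(B),
\qquad
\mathrm{Pl}(\text{male}) = m(m) + m(m \cdot f),
\qquad
\mathrm{Pl}(\text{female}) = m(f) + m(m \cdot f).
```

With illustrative values m(m) = 0.7, m(f) = 0.2, and m(m·f) = 0.1, this gives Pl(male) = 0.8 and Pl(female) = 0.3, so the user would be determined to be male.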

The main control unit 109 determines each user attribute element, such as the user's gender, from the upper probabilities obtained by Equation 4. It then selects, from the advertisement contents A, B, and C, the advertisement content whose associated target-audience attribute elements are closest to the determined user attribute elements. Specifically, the advertisement contents A, B, and C are registered in the advertisement content holding device 104 in association with target-audience attribute elements, which are any one or a combination of the user attribute elements determined by the main control unit 109. The main control unit 109 refers to the target-audience attribute elements of each advertisement content on the basis of the determined user attribute elements and chooses the advertisement content associated with the closest target-audience attribute elements. For example, suppose the main control unit 109 determines that, among the user attribute elements of telephone terminal A, (sex) is "male" and (age group) is "30s". If advertisement content A is associated with (sex) "male" and (age group) "20s", advertisement content B with (sex) "male" and (age group) "30s", and advertisement content C with (sex) "female" and (age group) "30s", the main control unit 109 selects advertisement content B as the advertisement content to distribute to telephone terminal A.
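The selection in this example can be sketched as follows. The specification does not define how "closest" is measured, so the sketch assumes the simplest possible metric, the number of matching attribute elements. The names `ADS`, `select_ad`, and the dictionary keys are hypothetical.

```python
# Target-audience attribute elements registered with each advertisement content.
ADS = {
    "A": {"sex": "male", "age_group": "20s"},
    "B": {"sex": "male", "age_group": "30s"},
    "C": {"sex": "female", "age_group": "30s"},
}


def select_ad(user_attributes, ads):
    """Return the identifier of the ad whose target attributes match the user best.

    Closeness is assumed to be the count of attribute elements that match exactly;
    on a tie, the ad that was registered first wins.
    """
    def score(target):
        return sum(1 for key, value in target.items()
                   if user_attributes.get(key) == value)

    return max(ads, key=lambda ad_id: score(ads[ad_id]))


chosen = select_ad({"sex": "male", "age_group": "30s"}, ADS)
```

A weighted or ordinal distance (for example, treating "20s" as nearer to "30s" than to "60s") would fit the same interface if exact matching proved too coarse.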

Here, the main control unit 109 selects and executes one of three estimation processes according to the terminal environment of the telephone terminals A and B: estimation by the video identification unit 107, estimation by the voice identification unit 108, or estimation using the video identification unit 107 and the voice identification unit 108 together. Specifically, the main control unit 109 determines, for each of the telephone terminals A and B, whether video is being transmitted by the IP videophone function, and automatically skips the estimation by the video identification unit 107 for a telephone terminal that transmits no video. In that case the distributed advertisement content is dedicated audio content, the audio portion of moving image content, or the like. When the main control unit 109 selects estimation by the video identification unit 107, it treats the basic likelihood probability calculated by the video identification unit 107 as the upper probability described above, determines each user attribute element, and selects the advertisement content. Likewise, when it selects estimation by the voice identification unit 108, it treats the basic likelihood probability calculated by the voice identification unit 108 as the upper probability, determines each user attribute element, and selects the advertisement content.
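The choice among the three estimation processes can be sketched as a small dispatch function. This is an illustration under assumptions: probabilities are modeled as plain dictionaries, `None` stands for "no video (or voice) received", and the `fuse` callback stands in for the integration and upper probability calculation, which the sketch does not implement.

```python
from typing import Callable, Dict, Optional

Probs = Dict[str, float]  # e.g. {"male": 0.7, "female": 0.2}


def choose_mode(has_video: bool, has_voice: bool) -> Optional[str]:
    """Pick the estimation process that the terminal environment allows."""
    if has_video and has_voice:
        return "combined"
    if has_video:
        return "video"
    if has_voice:
        return "voice"
    return None


def upper_probabilities(video: Optional[Probs],
                        voice: Optional[Probs],
                        fuse: Callable[[Probs, Probs], Probs]) -> Probs:
    """Return the probabilities used to determine the user attribute elements.

    With a single source, its basic likelihood probabilities are used directly
    as the upper probabilities; with both, they are fused first.
    """
    mode = choose_mode(video is not None, voice is not None)
    if mode == "combined":
        return fuse(video, voice)
    if mode == "video":
        return video
    if mode == "voice":
        return voice
    raise ValueError("no video or voice available for estimation")


# A terminal without videophone capability: video estimation is skipped.
voice_only = upper_probabilities(None, {"male": 0.7, "female": 0.2},
                                 fuse=lambda a, b: a)
```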

The content control unit 110 controls the transmission and reception of advertisement content to and from the content holding device 104, has a database (hereinafter, DB) that stores the advertisement content it has had the content holding device 104 transmit, and, on instruction from the main control unit 109, sends the advertisement content selected by the main control unit 109 to the mixing unit 111. The advertisement content is stored whenever it is updated in the content holding device 104.

ミキシング部１１１は、コンテンツ制御部１１０から受け取った広告コンテンツの再生処理を行い、主制御部１０９の指示に応じて再生処理を施した広告コンテンツを配信先の電話端末に送る。 The mixing unit 111 performs a reproduction process of the advertisement content received from the content control unit 110, and sends the advertisement content subjected to the reproduction process in response to an instruction from the main control unit 109 to a distribution destination telephone terminal.

ユーザ管理部１１２は、概念的に、ユーザ情報１１４と、映像識別情報１１５と、音声識別情報１１６と、最尤度情報１１７をＤＢとして有する。 The user management unit 112 conceptually includes user information 114, video identification information 115, audio identification information 116, and maximum likelihood information 117 as a DB.

As conceptually shown in FIG. 2, the user information 114 consists of a maintenance person registration unit, registered by the maintainer of the telephone system; a system automatic registration unit, registered automatically by the telephone system management apparatus 101; and a user registration unit, registered by the user.

The maintenance person registration unit has the items user ID (U1), password (U2), URI (Uniform Resource Identifier) (U3), and telephone number (U4). After a contract for the incentive advertising service based on viewing advertisement content is concluded between the maintainer of the telephone system and a user of the telephone terminals A and B, the maintainer registers each item so that it is unique to each user.

システム自動登録部は、ＩＰアドレス（Ｕ５）と、配信コンテンツ識別子（Ｕ６）と、ユーザステータス（Ｕ７）とを項目に有する。ＩＰアドレス（Ｕ５）のセルには、各電話端末Ａ，ＢのＩＰアドレスが電話端末Ａ，Ｂの利用開始時に登録される。配信コンテンツ識別子（Ｕ６）のセルには、電話端末Ａ，Ｂに配信された広告コンテンツのコンテンツ識別子（Ａ，Ｂ，Ｃ）が広告コンテンツの配信完了時に登録される。ユーザステータス（Ｕ７）のセルには、「ログイン」、「ログアウト」、「通話中」といった電話端末Ａ，Ｂの状態が登録される。 The system automatic registration unit includes an IP address (U5), a distribution content identifier (U6), and a user status (U7). In the cell of the IP address (U5), the IP addresses of the telephone terminals A and B are registered at the start of use of the telephone terminals A and B. In the distribution content identifier (U6) cell, the content identifier (A, B, C) of the advertisement content distributed to the telephone terminals A and B is registered when the distribution of the advertisement content is completed. In the cell of the user status (U7), the states of the telephone terminals A and B such as “login”, “logout”, and “busy” are registered.

ユーザ登録部は、ユーザネーム（Ｕ８）と、配信要求（Ｕ９）とを項目に有する。ユーザネーム（Ｕ８）のセルには、電話端末Ａ，Ｂを利用するユーザの名前が登録される。配信要求（Ｕ９）のセルには、電話端末Ａ，Ｂにおける広告コンテンツの配信要求の有無が登録される。 The user registration unit includes a user name (U8) and a distribution request (U9). The name of the user who uses the telephone terminals A and B is registered in the cell of the user name (U8). In the cell of the distribution request (U9), the presence / absence of a distribution request for advertising content in the telephone terminals A and B is registered.
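The three parts of the user information 114 map naturally onto a single record type. The following is a minimal sketch, assuming one record per user ID; the field names, method names, and sample values are hypothetical, and only the item numbers U1 to U9 come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserInfo:
    # Maintenance person registration unit (U1-U4)
    user_id: str
    password: str  # a real system would store a salted hash, not the password
    uri: str
    phone_number: str
    # User registration unit (U8, U9)
    user_name: str
    wants_ads: bool
    # System automatic registration unit (U5-U7), filled in at run time
    ip_address: Optional[str] = None
    delivered_content_id: Optional[str] = None
    status: str = "logout"

    def on_login(self, ip_address: str) -> None:
        """Record the terminal's IP address when the terminal starts being used."""
        self.ip_address = ip_address
        self.status = "login"

    def on_ad_delivered(self, content_id: str) -> None:
        """Record the content identifier once distribution has completed."""
        self.delivered_content_id = content_id


user = UserInfo("u001", "secret", "sip:a@example.com", "0312345678", "Taro", True)
user.on_login("192.0.2.10")
user.on_ad_delivered("B")
```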

図３に概念的に示すように、映像識別情報１１５は、ユーザＩＤ（Ｕ１）と、性別の尤度（Ｕ１１）と、年代の尤度（Ｕ１２）と、国籍の尤度（Ｕ１３）といったユーザ属性要素ごとの尤度（要素Ｚ）と、コンテンツ識別子（Ｕ１４）とを項目に有する。ユーザＩＤ（Ｕ１）のセルには、映像識別部１０７で識別された映像の送信元の電話端末を利用するユーザのユーザＩＤが登録される。性別の尤度（Ｕ１１）と、年代の尤度（Ｕ１２）と、国籍の尤度（Ｕ１３）といったユーザ属性要素ごとの尤度（要素Ｚ）のセルには、映像識別部１０７により算出された尤度がそれぞれ登録される。コンテンツ識別子（Ｕ１４）のセルには、映像識別部１０７による推定処理に基づいて主制御部１０９により選定された広告コンテンツのコンテンツ識別子が登録される。 As conceptually shown in FIG. 3, the video identification information 115 includes user ID (U1), gender likelihood (U11), age likelihood (U12), and nationality likelihood (U13). Items include a likelihood (element Z) for each attribute element and a content identifier (U14). In the user ID (U1) cell, a user ID of a user who uses the telephone terminal that is the transmission source of the video identified by the video identification unit 107 is registered. The cell of likelihood (element Z) for each user attribute element such as gender likelihood (U11), age likelihood (U12), and nationality likelihood (U13) was calculated by the video identification unit 107. Each likelihood is registered. In the content identifier (U14) cell, the content identifier of the advertising content selected by the main control unit 109 based on the estimation process by the video identification unit 107 is registered.

図４に概念的に示すように、音声識別情報１１６は、ユーザＩＤ（Ｕ１）と、性別の尤度（Ｕ２１）と、年代の尤度（Ｕ２２）と、国籍の尤度（Ｕ２３）といったユーザ属性要素ごとの尤度（要素Ｚ）と、コンテンツ識別子（Ｕ２４）とを項目に有する。ユーザＩＤ（Ｕ１）のセルには、音声識別部１０８で識別された音声の送信元の電話端末を利用するユーザのユーザＩＤが登録される。性別の尤度（Ｕ２１）と、年代の尤度（Ｕ２２）と、国籍の尤度（Ｕ２３）といったユーザ属性要素ごとの尤度（要素Ｚ）のセルには、音声識別部１０８により算出された尤度がそれぞれ登録される。コンテンツ識別子（Ｕ２４）のセルには、音声識別部１０８による推定処理に基づいて主制御部１０９により選定された広告コンテンツのコンテンツ識別子が登録される。 As conceptually shown in FIG. 4, the voice identification information 116 includes user ID (U1), gender likelihood (U21), age likelihood (U22), and nationality likelihood (U23). Items include a likelihood (element Z) for each attribute element and a content identifier (U24). In the user ID (U1) cell, the user ID of the user who uses the telephone terminal that is the voice transmission source identified by the voice identification unit 108 is registered. The cell of likelihood (element Z) for each user attribute element such as gender likelihood (U21), age likelihood (U22), and nationality likelihood (U23) is calculated by the voice identification unit 108. Each likelihood is registered. In the cell of the content identifier (U24), the content identifier of the advertising content selected by the main control unit 109 based on the estimation process by the voice identification unit 108 is registered.

As conceptually shown in FIG. 5, the maximum likelihood information 117 has as items the user ID (U1), the maximum likelihood for each user attribute element (element Z), such as the maximum likelihood of gender (U31), of age group (U32), and of nationality (U33), and the content identifier (U34). The user ID (U1) cell holds the user ID of the user of the telephone terminal that transmitted the video and voice identified by the video identification unit 107 and the voice identification unit 108. When estimation using the video identification unit 107 and the voice identification unit 108 together has been executed, the cells for the maximum likelihood of each user attribute element (element Z) hold the upper probabilities calculated by the main control unit 109, and the content identifier (U34) cell holds the content identifier of the advertisement content selected by the main control unit 109 on the basis of those upper probabilities. When the main control unit 109 has selected estimation by the video identification unit 107, the basic likelihood probabilities calculated by the video identification unit 107 are registered in the maximum likelihood of gender (U31) and the other cells as the upper probabilities described above, and the content identifier (U34) cell holds the content identifier registered in the content identifier (U14) of the video identification information 115. The same applies when the main control unit 109 has selected estimation by the voice identification unit 108.

As conceptually shown in FIG. 6, the incentive information unit 113 has the items date (B10), call ID (B11), originating IP address (B12), originating URI/telephone number (B13), destination IP address (B14), destination URI/telephone number (B15), start time (B16), end time (B17), content identifier (B18), content playback start time (B19), and content playback end time (B20).

The date (B10) cell holds the date on which the call session took place. The call ID (B11) cell holds the unique call ID assigned to each call session. The originating IP address (B12) cell holds the IP address of the calling terminal among the telephone terminals A and B in the session, and the originating URI/telephone number (B13) cell holds the URI or telephone number of the calling terminal. The destination IP address (B14) cell holds the IP address of the called terminal, and the destination URI/telephone number (B15) cell holds its URI or telephone number. The start time (B16) cell holds the time the call session started, and the end time (B17) cell holds the time it ended. The content identifier (B18) cell holds the content identifier of the advertisement content distributed to the telephone terminals A and B. The content playback start time (B19) cell holds the time playback of the distributed advertisement content started, and the content playback end time (B20) cell holds the time playback ended.
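One record of the incentive information unit 113 could be modeled as below. This is a sketch under assumptions: the field names and sample values are hypothetical, and the `playback_seconds` helper is an illustration of what the B19 and B20 items make computable, not a calculation the description specifies.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional


@dataclass
class IncentiveRecord:
    call_date: date                            # B10
    call_id: str                               # B11
    caller_ip: str                             # B12
    caller_uri_or_number: str                  # B13
    callee_ip: str                             # B14
    callee_uri_or_number: str                  # B15
    call_start: datetime                       # B16
    call_end: Optional[datetime] = None        # B17
    content_id: Optional[str] = None           # B18
    playback_start: Optional[datetime] = None  # B19
    playback_end: Optional[datetime] = None    # B20

    def playback_seconds(self) -> float:
        """Seconds of advertisement playback, or 0.0 if playback is incomplete."""
        if self.playback_start is None or self.playback_end is None:
            return 0.0
        return (self.playback_end - self.playback_start).total_seconds()


record = IncentiveRecord(
    call_date=date(2004, 10, 25),
    call_id="call-0001",
    caller_ip="192.0.2.20",
    caller_uri_or_number="sip:b@example.com",
    callee_ip="192.0.2.10",
    callee_uri_or_number="sip:a@example.com",
    call_start=datetime(2004, 10, 25, 10, 0, 0),
    content_id="B",
    playback_start=datetime(2004, 10, 25, 10, 5, 0),
    playback_end=datetime(2004, 10, 25, 10, 5, 30),
)
```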

上記構成を概念的に有する電話システムを用いた広告コンテンツ配信方法を以下に分説する。なお、広告コンテンツの配信は、通信開始前或いは呼び出し時、或いは保留時に配信することができるが、ここでは、一例として、電話保留時に広告コンテンツを配信する場合を説明する。なお、この例では、ユーザ管理部１１２のユーザ情報１１４のうち、保守者登録部（Ｕ１〜Ｕ４）とユーザ登録部（Ｕ８，Ｕ９）の項目には、所定の情報が予め登録されている。 An advertisement content distribution method using a telephone system conceptually having the above configuration will be described below. The advertisement content can be distributed before the start of communication, at the time of calling, or at the time of holding, but here, as an example, a case where the advertising content is distributed at the time of holding the telephone will be described. In this example, in the user information 114 of the user management unit 112, predetermined information is registered in advance in the items of the maintenance person registration unit (U1 to U4) and the user registration unit (U8, U9).

First, the flow from the login of telephone terminal A, through video identification (in which, when video showing the user of telephone terminal A is transmitted, the user attribute elements of that user are estimated from the video), to completion of the selection of advertisement content based on the video identification is described with reference to FIG. 7.

電話端末Ａは、電話システム管理装置１０１に対しパスワード（Ｕ２）等のログイン情報を送る（Ｓ１）。 The telephone terminal A sends login information such as a password (U2) to the telephone system management apparatus 101 (S1).

主制御部１０９は、受け取ったログイン情報をユーザ認証部１０６に送り、ユーザ認証部１０６に電話端末Ａの認証指示を行う（Ｓ２）。 The main control unit 109 sends the received login information to the user authentication unit 106 and instructs the user authentication unit 106 to authenticate the telephone terminal A (S2).

ユーザ認証部１０６は、電話端末Ａからのログイン情報に基づいて認証を実施し、認証結果を主制御部１０９に送る（Ｓ３）。 The user authentication unit 106 performs authentication based on the login information from the telephone terminal A, and sends the authentication result to the main control unit 109 (S3).

On receiving the authentication result from the user authentication unit 106, the main control unit 109, if the login of telephone terminal A has been authenticated, registers the IP address (U5), user status (U7), and so on of telephone terminal A in the user information 114 of the user management unit 112 (S4).
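Steps (S1) to (S4) amount to checking the login information and, on success, updating the system-registered items. A minimal sketch follows, assuming an in-memory dictionary in place of the user information 114. The function name, record keys, and sample values are hypothetical, and the plain-text password is kept only for brevity.

```python
import hmac


def handle_login(user_db, user_id, password, ip_address):
    """Authenticate a terminal (S2, S3) and register its IP address and status (S4)."""
    record = user_db.get(user_id)
    if record is None:
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(record["password"].encode(), password.encode()):
        return False
    record["ip_address"] = ip_address  # U5
    record["status"] = "login"         # U7
    return True


user_db = {"u001": {"password": "secret", "ip_address": None, "status": "logout"}}
ok = handle_login(user_db, "u001", "secret", "192.0.2.10")
rejected = handle_login(user_db, "u001", "wrong", "192.0.2.99")
```

A failed attempt leaves the record untouched, so the IP address registered by the earlier successful login is preserved.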

主制御部１０９は、ユーザ情報１１４に必要な登録を完了すると、電話端末Ａに対し、ユーザを写した映像の送信指示を行い、これを受けて、電話端末Ａは、音声案内等によりユーザに対しユーザ像の撮影を促す（Ｓ５）。 When the main control unit 109 completes the registration necessary for the user information 114, the main control unit 109 instructs the telephone terminal A to transmit a video showing the user, and the telephone terminal A receives the instruction by voice guidance or the like. The user is prompted to take a user image (S5).

主制御部１０９は、電話端末Ａから映像が送信された場合、その映像を映像識別部１０７に送る。これを受けて、映像識別部１０７は、映像中のユーザ像の位置から電話端末Ａのユーザが一人か複数人かを判断し、その結果、ユーザを一人と判断した場合、そのユーザのユーザ属性要素の推定処理を行い、電話端末Ａのユーザを複数人と判断した場合、各ユーザのユーザ属性要素を推定する（Ｓ６）。なお、主制御部１０９は、電話端末Ａから映像送信がない場合、電話端末Ａに関し映像識別部１０７による推定処理（Ｓ６）をスキップする。このスキップ処理は、（Ｓ５）の処理完了後から一定時間経過後に実行される。 When the video is transmitted from the telephone terminal A, the main control unit 109 sends the video to the video identification unit 107. In response, the video identification unit 107 determines whether the user of the telephone terminal A is one or more from the position of the user image in the video. As a result, if the user is determined to be one, the user attribute of the user is determined. When element estimation processing is performed and it is determined that there are a plurality of users of the telephone terminal A, a user attribute element of each user is estimated (S6). When there is no video transmission from the telephone terminal A, the main control unit 109 skips the estimation process (S6) by the video identification unit 107 for the telephone terminal A. This skip process is executed after a predetermined time has elapsed since the completion of the process of (S5).

Next, the video identification unit 107 registers the estimation results for the user attribute elements (the calculated likelihoods and so on) in the record for the user ID (U1) created in the video identification information 115 (S7). If it estimated the user attribute elements of multiple users in (S6), the video identification unit 107 integrates the users' likelihoods for each identical user attribute element and registers the integrated user attribute elements as those of the user of telephone terminal A.

ところで、上記統合処理では、各ユーザ間で性別や年代等にバラツキがある場合、ユーザ属性要素が識別不能になることも起こりうる。このような場合、映像識別部１０７は、上記の統合処理を中止し、各ユーザのユーザ属性要素をユーザＩＤ（Ｕ１）と関連付けてそれぞれ映像識別情報１１５に登録する。なお、映像識別部１０７が、ユーザを複数人と判断した後、または上記の統合処理を中止した後、所定の基準（例えば、映像の中心に最も近いユーザなど）で一のユーザを選択し、このユーザを電話端末Ａのユーザとすることもできる。 By the way, in the said integration process, when there is variation in sex, age, etc. between users, it may happen that user attribute elements become unidentifiable. In such a case, the video identification unit 107 cancels the integration process described above, and registers the user attribute element of each user in the video identification information 115 in association with the user ID (U1). In addition, after the video identification unit 107 determines that there are a plurality of users, or after the above integration process is stopped, the user is selected according to a predetermined criterion (for example, a user closest to the center of the video). This user can also be the user of telephone terminal A.
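The fallback of choosing a single representative user can be sketched as follows, using the criterion given as an example above (the user closest to the center of the video). The face records with `x` and `y` pixel coordinates are an assumed output of a face detector, which the sketch does not implement.

```python
def pick_primary_user(faces, frame_width, frame_height):
    """Return the detected face whose center is nearest the center of the frame."""
    if not faces:
        raise ValueError("no user detected in the video")
    cx, cy = frame_width / 2, frame_height / 2
    # Squared distance is enough for comparison, so no square root is needed.
    return min(faces, key=lambda f: (f["x"] - cx) ** 2 + (f["y"] - cy) ** 2)


faces = [
    {"id": "left", "x": 100, "y": 240},
    {"id": "middle", "x": 330, "y": 250},
    {"id": "right", "x": 600, "y": 200},
]
primary = pick_primary_user(faces, frame_width=640, frame_height=480)
```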

映像識別部１０７は、映像識別情報１１５への登録を終えると、主制御部１０９に電話端末Ａの映像識別処理（Ｓ６）の処理が完了したことを通知する（Ｓ８）。 When the registration to the video identification information 115 is completed, the video identification unit 107 notifies the main control unit 109 that the video identification process (S6) of the telephone terminal A has been completed (S8).

主制御部１０９は、映像識別情報１１５に登録された尤度に基づき、電話端末Ａに配信する広告コンテンツを選定し、その広告コンテンツの識別子を映像識別情報１１５のコンテンツ識別子（Ｕ１４）に登録する（Ｓ９）。ここで、主制御部１０９は、（Ｓ７）において各ユーザのユーザ属性要素が登録されている場合、ユーザごとに広告コンテンツを選定し、上記と同様に登録する。 Based on the likelihood registered in the video identification information 115, the main control unit 109 selects advertising content to be distributed to the telephone terminal A, and registers the identifier of the advertising content in the content identifier (U14) of the video identification information 115. (S9). Here, when the user attribute element of each user is registered in (S7), the main control unit 109 selects the advertisement content for each user and registers it in the same manner as described above.

Next, assuming that steps (S1) to (S9) have been completed for each of telephone terminals A and B, FIG. 8 is used to describe the sequence from call origination and termination between telephone terminals A and B, through voice identification, which estimates the user attribute elements of the users of telephone terminals A and B from their voices, to completion of advertising content selection based on that voice identification.

To connect to telephone terminal A, telephone terminal B places a call to telephone terminal A using telephone terminal A's telephone number, URI, or the like (S11).

The main control unit 109 identifies telephone terminal A from the telephone number, URI, or the like received from telephone terminal B, and checks the information registered in the user status (U7) of that terminal's user information 114 in the user management unit 112 (S12).

Unless the information registered in the user status (U7) is "logged out" or "busy", the main control unit 109 executes the call process for telephone terminal A (S13).
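The status check in (S12) and (S13) amounts to a small routing decision, sketched below. The status strings, the dictionary standing in for the user status (U7), and the treatment of an unknown address as a rejection are illustrative assumptions; the description only states that the call process runs unless the status is logged out or busy.

```python
from typing import Dict

# Hypothetical encodings of the two user status (U7) values that block a call.
NON_RINGABLE = {"logged_out", "busy"}


def route_call(callee_address: str, user_status: Dict[str, str]) -> str:
    """Return "ring" when the callee may be called, otherwise "reject".

    callee_address is the telephone number or URI received from the caller;
    user_status maps each registered address to its current status.
    """
    status = user_status.get(callee_address)
    if status is None or status in NON_RINGABLE:
        return "reject"  # assumption: unknown addresses are rejected as well
    return "ring"
```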

Telephone terminal A answers the call from the main control unit 109 (S14), and on receiving this answer the main control unit 109 sends an answer notification to telephone terminal B (S15).

On confirming the call between telephone terminals A and B, the main control unit 109 registers the billing information for the start of the call in the incentive information unit 113, namely the date (B10), call ID (B11), originating IP address (B12), originating URI/telephone number (B13), destination IP address (B14), destination URI/telephone number (B15), and start time (B16) (S16), and has the communication control unit 105 start communication exchange between telephone terminals A and B (S17).

Once telephone terminals A and B are in the call state, the main control unit 109 sends them a voice identification instruction, and in response telephone terminals A and B prompt their users, by voice guidance or the like, to speak for voice identification (S18). If the video identification process (S7) determined that telephone terminal A has multiple users, the main control unit 109 has telephone terminal A prompt one representative to speak. Alternatively, the voice identification unit 108 may be configured to treat the first utterance transmitted from telephone terminals A and B as that of their users and run voice identification on it automatically. If, in the video identification process (S7), the user of telephone terminal A was chosen by a predetermined criterion (for example, the user closest to the center of the video), the main control unit 109 has telephone terminal A prompt that user to speak.

The main control unit 109 forwards the audio transmitted from telephone terminals A and B to the voice identification unit 108, which then estimates the user attribute elements of the users of telephone terminals A and B (S19). Alternatively, during the call, audio split into fixed-length intervals may be treated as that of the users of telephone terminals A and B and used to estimate the user attribute elements. The audio can also be delimited by a few keywords chosen so that the content of the call is not captured. For example, the voice identification unit 108 is made to recognize the utterances 「私」 (watashi, "I") and 「ます」 (masu, a polite verb ending), and the audio from telephone terminal A between the utterance of 「私」 and the utterance of 「ます」 is treated as that of the user of telephone terminal A for estimating the user attribute elements.
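The keyword-delimited segmentation can be sketched as follows. The sketch assumes that a keyword spotter has already produced a list of (timestamp, keyword) events; that input format and the function name are hypothetical and are not taken from the description.

```python
from typing import List, Tuple


def keyword_segments(events: List[Tuple[float, str]],
                     start_kw: str = "私",
                     end_kw: str = "ます") -> List[Tuple[float, float]]:
    """Return (start, end) time spans running from start_kw to the next end_kw.

    events holds (timestamp_in_seconds, detected_keyword) pairs. Only the
    keyword timestamps are used, so nothing else about the conversation
    needs to be recognized.
    """
    segments: List[Tuple[float, float]] = []
    start = None
    for timestamp, keyword in sorted(events):
        if keyword == start_kw and start is None:
            start = timestamp  # open a segment at the first start keyword
        elif keyword == end_kw and start is not None:
            segments.append((start, timestamp))  # close it at the next end keyword
            start = None
    return segments
```

The audio inside each returned span would then be passed to the attribute estimation as belonging to the terminal's user.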

The voice identification unit 108 registers the estimation results for the user attribute elements of telephone terminals A and B (the calculated likelihoods and so on) in the records for the user ID (U1) created in the voice identification information 116 (S20).

Once registration in the voice identification information 116 is finished, the voice identification unit 108 notifies the main control unit 109 that voice identification is complete (S21).

The main control unit 109 then selects, for each telephone terminal, the advertising content to distribute to telephone terminals A and B based on the likelihoods registered in the voice identification information 116, and registers that content's identifier in the content identifier (U24) of the voice identification information 116. It further uses the likelihoods registered in the video identification information 115 and the voice identification information 116 to calculate the upper limit probability of each user attribute element, registers these in the corresponding items of the maximum likelihood information 117, determines for each telephone terminal the final advertising content to distribute according to the information registered in the maximum likelihood information 117, and registers it in the content identifier (U34) of the maximum likelihood information 117 (S22). If the video identification process (S6) was skipped, the main control unit 109 registers the advertising content identifier held in the content identifier (U24) of the voice identification information 116 in the content identifier (U34) of the maximum likelihood information 117. If content identifiers are registered per user in the video identification information 115, the main control unit 109 skips the upper limit probability calculation and registers those content identifiers per user.
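A rough sketch of how the two sets of likelihoods might be combined and mapped to a content identifier is given below. This passage does not give the formula for the upper limit probability, so taking the larger of the video and voice likelihoods for each value is an assumption, as are the catalog structure and the function names.

```python
from typing import Dict, Optional

Likelihoods = Dict[str, Dict[str, float]]  # element -> candidate value -> likelihood


def upper_limit(video: Likelihoods, audio: Likelihoods) -> Likelihoods:
    """Combine video and voice likelihoods (assumption: element-wise maximum)."""
    combined: Likelihoods = {}
    for element in set(video) | set(audio):
        v = video.get(element, {})
        a = audio.get(element, {})
        combined[element] = {
            value: max(v.get(value, 0.0), a.get(value, 0.0))
            for value in set(v) | set(a)
        }
    return combined


def select_content(combined: Likelihoods,
                   catalog: Dict[str, Dict[str, str]]) -> Optional[str]:
    """Return the first content ID whose target attributes match the best guess.

    catalog is a hypothetical mapping: content ID -> {element: target value}.
    """
    profile = {element: max(values, key=values.get)
               for element, values in combined.items()}
    for content_id, target in catalog.items():
        if all(profile.get(element) == value for element, value in target.items()):
            return content_id
    return None
```

Passing an empty dictionary for `video` reduces the result to the voice likelihoods alone, which mirrors the case in which video identification was skipped.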

Next, FIG. 9 is used to describe the sequence from the point where telephone terminals A and B go from the call state to the hold state until distribution of the selected advertising content is complete. In the following, it is assumed that "A", denoting advertising content A, is registered in the content identifier (U34) for telephone terminal A, and "B", denoting advertising content B, is registered in the content identifier (U34) for telephone terminal B.

While telephone terminals A and B are in the call state (S17), the user of telephone terminal B performs a hold operation on telephone terminal B, and telephone terminal B sends a hold request to the main control unit 109 (S31).

In response, the main control unit 109 sends a hold instruction to telephone terminal A (S32), searches the user information 114 in the user management unit 112, and checks the distribution request (U9) for telephone terminal B to see whether advertising content distribution has been requested (S33).

On confirming that telephone terminal B has an advertising content distribution request, the main control unit 109 sends the content control unit 110 the content identifier "A" registered in the content identifier (U34) of the maximum likelihood information 117 for telephone terminal A, together with an instruction to distribute the advertising content having that identifier (a content instruction) (S34).

In response, the content control unit 110 obtains advertising content A from the advertising content holding device 104 (S35) and passes it to the mixing unit 111 (S36). If advertising content A is already stored in the DB of the content control unit 110, (S35) is skipped.

The mixing unit 111 performs playback processing on the received advertising content A and, when that processing is complete, notifies the main control unit 109 that playback preparation is complete (S37).

In response, the main control unit 109 sends a content playback instruction to telephone terminal A (S38) and begins registering, in sequence, the information associated with content playback in the incentive information unit 113, such as the content identifier (B18) and the content playback start time (B19) (S39).

On receiving the content playback instruction, telephone terminal A enters the content playback state with the mixing unit 111, telephone terminal A outputs content A, and distribution of the advertising content is complete (S40). If content identifiers are registered per user in the maximum likelihood information 117, the advertising content for those identifiers is delivered to telephone terminal A one after another. If telephone terminal A is the one that issues the hold request, content B is delivered to telephone terminal B in the same way.

Next, FIG. 10 is used to describe the sequence from the hold state to disconnection between telephone terminals A and B.

While the content playback state is established between telephone terminal A and the mixing unit 111 (S40), telephone terminal B sends a hold release request to the main control unit 109 (S41).

In response, the main control unit 109 registers the content playback end time (B20) in the incentive information unit 113 (S42) and sends a hold release instruction to telephone terminal A (S43).

In response, telephone terminal A ends communication with the mixing unit 111 and re-establishes the call state with telephone terminal B (S44).

When the call with telephone terminal A ends, telephone terminal B sends a disconnection request to the main control unit 109 (S45).

In response, the main control unit 109 registers the end time (B17) in the incentive information unit 113, and disconnection between telephone terminals A and B is complete (S46).
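The items written to the incentive information unit 113 over the life of a call, (B10) to (B20), can be pictured as a single record that is filled in step by step. The following sketch is only one possible representation; the field names, types, and the `playback_seconds` helper are assumptions and do not appear in the description.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class IncentiveRecord:
    # Registered when the call starts (S16)
    date: str                                   # B10
    call_id: str                                # B11
    caller_ip: str                              # B12
    caller_uri: str                             # B13
    callee_ip: str                              # B14
    callee_uri: str                             # B15
    start_time: datetime                        # B16
    # Registered when the call is disconnected (S46)
    end_time: Optional[datetime] = None         # B17
    # Registered when content playback starts (S39)
    content_id: Optional[str] = None            # B18
    playback_start: Optional[datetime] = None   # B19
    # Registered when the hold is released (S42)
    playback_end: Optional[datetime] = None     # B20

    def playback_seconds(self) -> float:
        """Seconds of advertising playback, or 0.0 if playback never completed."""
        if self.playback_start is None or self.playback_end is None:
            return 0.0
        return (self.playback_end - self.playback_start).total_seconds()
```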

The telephone system according to the embodiment can be realized by an advertising content distribution program that causes the computer constituting the telephone system management apparatus to execute the processing of the advertising content distribution method described above. This advertising content distribution program can be recorded on a computer-readable recording medium.

In the embodiment, the video identification unit 107 and the voice identification unit 108 may also each be given the ability to run several types of identification method, or may each be built from several identification units that use different identification methods, so that a suitable identification method or identification unit is selected according to the type of user attribute element to be estimated. This improves the agreement between each identification unit's estimate and the actual user's attribute elements. For example, the video identification unit 107 may consist of an identification unit using a method specialized for sex identification and an identification unit using a method specialized for age group identification, with the video and audio transmitted from telephone terminals A and B processed by the identification units in parallel.

As described above, in this embodiment the information needed to provide an incentive advertising service, delivered to users of telephone terminals A and B through advertising content distribution, is recorded in the incentive information unit 113. This makes it possible, for example, to offer a discount service, a free service, or other benefits to one in every so many people, broken down by the sex and age group of the advertising targets.

In this embodiment, playback processing is performed in the mixing unit 111 for each selected piece of advertising content, so advertising content can be distributed to each telephone terminal individually.

In this embodiment, the voice identification unit 108 estimates user attribute elements such as sex and age group, so no analysis that intrudes on the content of the call is needed, and user privacy can be protected. In addition, voice identification and video identification can estimate user attribute elements such as nationality and race, so the incentive advertising service can be provided internationally, for example through public telephone terminals installed at international airports.

In this embodiment, advertising content is selected through the video identification process and the voice identification process, so user attribute elements need not be registered in the user information 114, and the same incentive advertising service can be provided even on public telephone terminals used by unspecified persons. Of course, with a home videophone terminal the users are fixed to some extent, so registering personal information and user attribute elements for each family member in the user information 114 in advance and combining them with the video and voice identification processes allows more effective distribution of advertising content.

When a public telephone terminal is connected to the telephone system according to this embodiment, a service can be offered in which, when the user of the public telephone terminal matches the advertising target, one in every so many such users receives a benefit. Concretely, when the target attribute elements for an advertising content are set to (sex) "female" and (age group) "teens", a present is given to 100 winners drawn by lottery from among the public telephone terminal users estimated to have matching user attribute elements. In this case, an input section for the winner's personal information is provided in the incentive information unit 113, and after the call ends the display screen on the public telephone terminal outputs a message that the user has won the present, along with input guidance asking the user to register the personal information needed to deliver it. Of course, presents may instead be given to everyone who qualifies rather than by lottery.
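The lottery among matching users can be sketched in a few lines. The data layout (a mapping from a call or user identifier to estimated attribute values), the function name, and the use of a seeded pseudo-random draw are illustrative assumptions; the description states only that winners are drawn by lottery from users estimated to match the target attributes.

```python
import random
from typing import Dict, List, Optional


def draw_winners(users: Dict[str, Dict[str, str]],
                 target: Dict[str, str],
                 n_winners: int = 100,
                 seed: Optional[int] = None) -> List[str]:
    """Draw up to n_winners IDs from users whose estimated attributes match target.

    users maps an ID to estimated attributes, e.g. {"sex": "female", "age": "teens"}.
    """
    eligible = [
        uid for uid, attrs in users.items()
        if all(attrs.get(element) == value for element, value in target.items())
    ]
    if len(eligible) <= n_winners:
        return eligible  # fewer candidates than prizes: everyone eligible wins
    return random.Random(seed).sample(eligible, n_winners)
```

Setting `n_winners` to at least the number of eligible users corresponds to the variant in which every matching user receives a present.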

As described above, running the incentive advertising service with public telephone terminals connected to the telephone system of this embodiment achieves an advertising effect comparable to the conventional approach, in which a person handing out advertising samples or presents stands on the street and judges subjectively whether each passer-by is an advertising target before handing out the sample. It also greatly reduces labor costs and, at the same time, allows personal information on the advertising targets to be collected.

FIG. 1 is a block diagram conceptually showing the overall configuration of the telephone system and the telephone system management apparatus according to the embodiment.
FIG. 2 is a conceptual diagram showing the data structure of the user information provided in the user management unit of the telephone system management apparatus.
FIG. 3 is a conceptual diagram showing the data structure of the video identification information of the same apparatus.
FIG. 4 is a conceptual diagram showing the data structure of the voice identification information of the same apparatus.
FIG. 5 is a conceptual diagram showing the data structure of the maximum likelihood information of the same apparatus.
FIG. 6 is a conceptual diagram showing the data structure of the incentive information unit of the same apparatus.
FIG. 7 is a partial sequence diagram showing, within the flow of the advertising content distribution method using the telephone system according to the embodiment, the steps from login of a telephone terminal to completion of the video identification process.
FIG. 8 is a partial sequence diagram showing the steps from call origination by a telephone terminal to completion of the advertising content selection process.
FIG. 9 is a partial sequence diagram showing the steps from the hold process between the telephone terminals to completion of the advertising content distribution process.
FIG. 10 is a partial sequence diagram showing the steps from the hold release process between the telephone terminals to completion of the disconnection process.

Explanation of symbols

101 telephone system management apparatus
102 communication network
103 IP network
104 advertising content holding device
105 communication control unit
106 user authentication unit
107 video identification unit
108 voice identification unit
109 main control unit
110 content control unit
111 mixing unit
112 user management unit
113 incentive information unit
114 user information
115 video identification information
116 voice identification information
117 maximum likelihood information

Claims

A telephone system comprising a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, the telephone system management apparatus selecting any advertising content from a plurality of advertising contents registered in advance and distributing the selected advertising content to a telephone terminal in use, wherein the telephone system management apparatus comprises a voice identification unit that estimates a user attribute element of a user of the telephone terminal from the voice of the user, and selects the advertising content according to the user attribute element estimated by the voice identification unit.

A telephone system comprising a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, the telephone system management apparatus selecting any advertising content from a plurality of advertising contents registered in advance and distributing the selected advertising content to a telephone terminal in use, wherein the telephone system management apparatus comprises a video identification unit that, when a video showing a user of the telephone terminal is transmitted from the telephone terminal, estimates a user attribute element of the user from the video, and selects the advertising content according to the user attribute element estimated by the video identification unit.

The telephone system according to claim 2, wherein the telephone system management apparatus comprises a voice identification unit that estimates a user attribute element of a user of the telephone terminal from the voice of the user, and estimates the user attribute element by using the voice identification unit and the video identification unit in combination.

The telephone system according to any one of claims …, wherein the telephone system management apparatus acquires the advertising content from an advertising content holding device connected through a network including at least one of an IP network and a PSTN.

The telephone system according to claim 3, wherein the telephone system management apparatus selects and executes, according to the terminal environment of the telephone terminal, one of the estimation process by the voice identification unit, the estimation process by the video identification unit, and the estimation process using the voice identification unit and the video identification unit in combination.

The telephone system according to any one of claims 1 to 5, wherein the telephone system management apparatus estimates the user attribute element for each telephone terminal engaged in communication exchange and selects the advertising content for each of these telephone terminals.

A telephone system management apparatus that controls communication between telephone terminals connected to a communication network, selects any advertising content from a plurality of advertising contents registered in advance, and distributes the selected advertising content to a telephone terminal in use, the telephone system management apparatus comprising a voice identification unit that estimates a user attribute element of a user of the telephone terminal from the voice of the user, wherein the advertising content is selected according to the user attribute element estimated by the voice identification unit.

A telephone system management apparatus that controls communication between telephone terminals connected to a communication network, selects any advertising content from a plurality of advertising contents registered in advance, and distributes the selected advertising content to a telephone terminal in use, the telephone system management apparatus comprising a video identification unit that, when a video showing a user of the telephone terminal is transmitted from the telephone terminal, estimates a user attribute element of the user from the video, wherein the advertising content is selected according to the user attribute element estimated by the video identification unit.

The telephone system management apparatus according to claim 8, further comprising a voice identification unit that estimates a user attribute element of a user of the telephone terminal from the voice of the user, wherein the user attribute element is estimated by using the voice identification unit and the video identification unit in combination.

The telephone system management apparatus according to claim 9, wherein one of the estimation process by the voice identification unit, the estimation process by the video identification unit, and the estimation process using the voice identification unit and the video identification unit in combination is selected and executed according to the terminal environment of the telephone terminal.

The telephone system management apparatus according to any one of claims 7 to 10, wherein the user attribute element is estimated for each telephone terminal that performs communication exchange, and the advertisement content is selected for each telephone terminal.

An advertising content distribution method using a telephone system comprising a plurality of telephone terminals, a telephone system management apparatus that controls communication between these telephone terminals, and a communication network that connects the telephone system management apparatus and the telephone terminals, the method including a step in which the telephone system management apparatus selects any advertising content from a plurality of advertising contents registered in advance and a step of distributing the selected advertising content to a telephone terminal in use, wherein the telephone system management apparatus selects and executes, according to the terminal environment of the telephone terminal, one of: a voice identification step of estimating a user attribute element of a user of the telephone terminal from the voice of the user; a video identification step of estimating the user attribute element of the user from a video showing the user of the telephone terminal when such a video is transmitted from the telephone terminal; and a step of estimating the user attribute element by using the voice identification and the video identification in combination.

An advertisement content distribution program for causing a telephone system to execute the advertisement content distribution method according to claim 12.

A computer-readable recording medium on which the advertising content distribution program according to claim 13 is recorded.