JP2017191390A

JP2017191390A - Communication system, communication log collection system, server, and communication method

Info

Publication number: JP2017191390A
Application number: JP2016079220A
Authority: JP
Inventors: 義博中橋; Yoshihiro Nakahashi; 貴史鹿田; Takashi Shikata; 尋満山内; Hiromitsu Yamauchi
Original assignee: Robot Start Inc
Current assignee: Robot Start Inc
Priority date: 2016-04-12
Filing date: 2016-04-12
Publication date: 2017-10-19

Abstract

PROBLEM TO BE SOLVED: To increase the quality of communication between a user and a robot. SOLUTION: The present invention relates to a communication system having a robot on a user side and a server. The robot includes: a microphone for collecting the user's voice; transmission/reception means for sending a voice signal of the user's voice collected by the microphone to the server via a network and receiving a user's voice signal sent via the network; voice editing means for editing the user's voice signal sent via the network as an utterance of the robot and generating an edited voice signal; and at least one speaker for outputting the edited voice signal. The server has connection management means for managing the sending and receiving of signals of the robot. SELECTED DRAWING: Figure 1

Description

The present invention relates to a communication system, a conversation log collection system, a server, and a communication method.

In recent years, communication systems that establish a dialogue between a person and a robot (for example, a humanoid robot) have been proposed.

One kind is called the task-oriented type: a dialogue system for having a robot perform a specific task. For example, in response to an utterance (command) from a human user such as "Tell me today's weather," the robot conveys today's weather forecast by voice. These sets of commands and answers are uniquely registered in a dictionary in advance.

The other kind is called the chat type: rather than having the robot perform a specific task, it is a system that lets the user enjoy conversation with the robot (Non-Patent Document 1). It applies a chatbot dialogue system (the Japanese term translates literally as "artificial incompetence"). A chatbot dialogue system carries on everyday conversation with a user and is broadly classified into the dictionary type (scenario type), the log type, the Markov sentence generation type (text generation type), and so on. The basic idea is to store predetermined dialogue patterns in a database, search for a suitable response according to the input received during the dialogue, and output it from the system side. For example, when the user inputs "What do you like?" to the dialogue system through a keyboard, microphone, or the like, the system searches for the response data that best matches the word sequence "what-do-you-like?". The database stores a large number of input examples and corresponding response sentences in advance. The dialogue system takes out the response sentence selected by the search and outputs it to the user via a speaker or monitor. By devising how response content is stored in the database, part of the user's input can also be inserted into the response sentence.
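As a rough illustration of the database lookup described above, the following Python sketch picks the stored response whose input example shares the most words with the user's input. The example pairs, the Jaccard word-overlap score, and the names `RESPONSES`, `tokenize`, and `best_response` are assumptions made for this illustration; the text does not say how any actual chatbot system scores a match.

```python
import re
from typing import Optional

# Hypothetical database of (input example, response sentence) pairs.
RESPONSES = [
    ("what do you like", "I like apple pie."),
    ("tell me today's weather", "Here is today's weather forecast."),
    ("how are you", "I am fine, thank you."),
]


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))


def best_response(user_input: str) -> Optional[str]:
    """Return the response whose input example best matches user_input."""
    words = tokenize(user_input)
    best, best_score = None, 0.0
    for example, response in RESPONSES:
        example_words = tokenize(example)
        union = words | example_words
        # Jaccard similarity: shared words divided by all distinct words.
        score = len(words & example_words) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = response, score
    return best  # None when no stored example shares any word with the input


print(best_response("What do you like?"))
```

A real dictionary-type engine would need far more input examples and a better matching measure; the point here is only the search-then-output flow.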

[Valerie] Valerie Web Site: http://www.roboceptionist.com/

However, at present none of these methods, whether dictionary type (scenario type), log type, or Markov sentence generation type (text generation type), is perfect, and in many cases a conversation between a human and a robot fails to hold together.

On the other hand, robot users want reasonably polished communication with their robots.

The present invention has been made in view of the above problem, and its object is to provide a communication system, a conversation log collection system, a server, and a communication method that can improve the quality of conversation between a user and a robot.

One aspect of the present invention is a communication system comprising a robot installed on a user side and a server, wherein the robot has: a microphone that collects a user's voice; transmission/reception means that transmits a voice signal of the user's voice collected by the microphone to the server through a network and receives a user's voice signal sent through the network; voice editing means that edits the user's voice signal sent through the network as an utterance of the robot to generate an edited voice signal; and at least one speaker that outputs the edited voice signal; and wherein the server has connection management means that manages transmission and reception of signals of the robot.

One aspect of the present invention is a conversation log collection system in a communication system, comprising a robot installed on a user side and a server, wherein the robot has: a microphone that collects a user's voice; transmission/reception means that transmits a voice signal of the user's voice collected by the microphone to the server through a network and receives a user's voice signal sent through the network; voice editing means that edits the user's voice signal sent through the network as an utterance of the robot to generate an edited voice signal; and at least one speaker that outputs the edited voice signal; and wherein the server has: connection management means that manages transmission and reception of signals of the robot; voice recognition means that recognizes a user's voice signal sent through the network and converts it into text; and conversation log collection means that collects, as a conversation log, the character string into which the user's voice has been converted.

One aspect of the present invention is a conversation log collection server having: connection management means that manages transmission and reception of signals between robots, each robot having a microphone that collects a user's voice, transmission/reception means that transmits a voice signal of the user's voice collected by the microphone to the server through a network and receives a user's voice signal sent through the network, voice editing means that edits the user's voice signal sent through the network as an utterance of the robot to generate an edited voice signal, and at least one speaker that outputs the edited voice signal; voice recognition means that recognizes a user's voice signal sent through the network and converts it into text; and conversation log collection means that collects, as a conversation log, the character string into which the user's voice has been converted.

One aspect of the present invention is a communication method in which: a robot installed on the side of a speaking user collects the user's voice with a microphone; the robot installed on the side of the speaking user transmits a voice signal of the user's voice collected by the microphone to a server through a network; the server transmits the received voice signal of the user's voice to a destination robot; the destination robot receives the user's voice signal sent through the network; the destination robot edits the user's voice signal as an utterance of the robot to generate an edited voice signal; and the destination robot outputs the edited voice signal.

The present invention can improve the quality of conversation between a user and a robot.

FIG. 1 is a diagram schematically showing a communication robot system according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing the configuration of the robot 1.
FIG. 3 is a block diagram of the server 3 in the first embodiment.
FIG. 4 is a diagram showing an example of the connection management database 32.
FIG. 5 is a diagram showing a modification of the server 3.
FIG. 6 is a diagram showing an example of the connection management database 32 in the modification of the first embodiment.
FIG. 7 is a block diagram of the server 3 in the second embodiment.
FIG. 8 is a block diagram of the server 3 in the third embodiment.

<First Embodiment>
A first embodiment of the present invention will be described.

FIG. 1 is a diagram schematically showing a communication robot system according to the first embodiment of the present invention.

In FIG. 1, reference numeral 1 denotes a robot installed on the user A side, 2 denotes a robot installed on the user B side, and 3 denotes a server to which the robot 1 and the robot 2 are connected.

Since the robot 1 and the robot 2 are the same, the configuration of the robot will be described using the robot 1 as an example.

FIG. 2 is a block diagram showing the configuration of the robot 1.

As shown in FIG. 2, the robot 1 has a microphone 11, a voice editing unit 12, a speaker 13, and a control unit 14.

The microphone 11 collects the voice of the user A.

The voice editing unit 12 edits the voice signal of the user B, sent from the server 3 through the network, as an utterance of the robot 1 and generates an edited voice signal. Here, editing the voice signal of the user B as an utterance of the robot 1 means editing (converting) the voice (timbre or tone) of the user B in that signal into the voice (timbre or tone) of the robot 1. For example, the voice of a male or female user is edited (converted) into a gender-neutral voice peculiar to the robot, or into a robot voice customized by the user.
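The description does not say how the voice editing unit 12 performs the conversion. Purely as a placeholder, the Python sketch below changes the pitch of a list of audio samples by naive resampling with linear interpolation. The function name `resample_pitch` and the `factor` parameter are hypothetical, and this crude method also changes the duration of the signal, which a practical voice converter would have to avoid.

```python
from typing import List


def resample_pitch(samples: List[float], factor: float) -> List[float]:
    """Naive pitch change: read the input `factor` times faster.

    factor > 1 raises the pitch and shortens the signal;
    factor < 1 lowers the pitch and lengthens it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(samples) < 2:
        return list(samples)
    last = len(samples) - 1
    out_len = int(last / factor) + 1
    out = []
    for i in range(out_len):
        pos = i * factor                 # fractional read position in the input
        left = min(int(pos), last)       # sample just before the read position
        right = min(left + 1, last)      # sample just after it
        frac = pos - left
        # Linear interpolation between the two neighbouring samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# Hypothetical use: a robot whose configured voice is higher than the user's.
robot_voice = resample_pitch([0.0, 1.0, 2.0, 3.0, 4.0], 2.0)
print(robot_voice)
```

A per-robot `factor` could stand in for the "voice peculiar to the robot" or for a user-customized voice, but that mapping is an assumption of this sketch.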

The speaker 13 is at least one speaker that outputs the edited voice signal edited (converted) by the voice editing unit 12.

FIG. 3 is a block diagram of the server 3 in the first embodiment.

The server 3 has an inter-robot connection management unit 31 and a connection management database 32.

As shown in FIG. 4, the connection management database 32 stores robot identification information (ID), connection status (connected or disconnected), and the robot identification information (ID) of the connection destination in association with one another.

The inter-robot connection management unit 31 uses the connection management database 32 to manage connections between robots, in this example the connection between the robot 1 and the robot 2.
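The three associated fields of the connection management database 32 can be pictured as a small table keyed by robot ID. The following Python sketch is one possible in-memory form, assuming hypothetical names (`ConnectionRecord`, `ConnectionDatabase`, `connect`, `disconnect`, `peer_of`) and string IDs such as `"robot-1"`; the actual storage format is not given in the text.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ConnectionRecord:
    """One row: robot ID, connection status, and ID of the connected robot."""
    robot_id: str
    connected: bool = False          # True = connected, False = disconnected
    peer_id: Optional[str] = None    # robot at the other end, if any


class ConnectionDatabase:
    """Hypothetical in-memory stand-in for connection management database 32."""

    def __init__(self) -> None:
        self.records: Dict[str, ConnectionRecord] = {}

    def register(self, robot_id: str) -> None:
        self.records[robot_id] = ConnectionRecord(robot_id)

    def connect(self, a: str, b: str) -> bool:
        """Link two disconnected robots; return False if either is unknown or busy."""
        rec_a, rec_b = self.records.get(a), self.records.get(b)
        if rec_a is None or rec_b is None or a == b:
            return False
        if rec_a.connected or rec_b.connected:
            return False
        rec_a.connected, rec_a.peer_id = True, b
        rec_b.connected, rec_b.peer_id = True, a
        return True

    def disconnect(self, robot_id: str) -> None:
        """Mark a robot and its current peer (if any) as disconnected."""
        rec = self.records.get(robot_id)
        if rec is None or not rec.connected:
            return
        peer = self.records.get(rec.peer_id) if rec.peer_id else None
        for r in (rec, peer):
            if r is not None:
                r.connected, r.peer_id = False, None

    def peer_of(self, robot_id: str) -> Optional[str]:
        """Return the ID of the connected robot, or None if disconnected."""
        rec = self.records.get(robot_id)
        return rec.peer_id if rec and rec.connected else None


db = ConnectionDatabase()
for rid in ("robot-1", "robot-2"):
    db.register(rid)
db.connect("robot-1", "robot-2")
print(db.peer_of("robot-1"))
```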

Next, the operation of the present embodiment will be described. It is assumed that the communication connection between the robot 1 and the robot 2 is managed by the inter-robot connection management unit 31 of the server 3.

First, the user A speaks to the robot 1. The voice of the user A is collected by the microphone 11, and the voice signal is transmitted to the server 3. For example, when the user A says "I'm going to make an apple pie and eat it.", the voice signal of "I'm going to make an apple pie and eat it." is transmitted to the server 3.

The inter-robot connection management unit 31 of the server 3 refers to the connection management database 32 and transmits the voice signal received from the robot 1 to the robot 2.

In the robot 2, the voice editing unit 12 edits the received voice signal as an utterance of the robot 2 and generates an edited voice signal. For example, the received voice signal of "I'm going to make an apple pie and eat it." is edited as an utterance of the robot 2 to generate an edited voice signal. The edited voice signal is then output from the speaker 13. For example, "I'm going to make an apple pie and eat it." is output in the voice peculiar to the robot 2.

Here, the user B replies to the voice uttered by the robot 2 by speaking to the robot 2. The voice of the user B is collected by the microphone 11, and the voice signal is transmitted to the server 3. For example, when the user B replies "Sounds good. I want some." to "I'm going to make an apple pie and eat it.", the voice signal of "Sounds good. I want some." is transmitted to the server 3.

The inter-robot connection management unit 31 of the server 3 refers to the connection management database 32 and transmits the voice signal received from the robot 2 to the robot 1.

In the robot 1, the voice editing unit 12 edits the received voice signal as an utterance of the robot 1 and generates an edited voice signal. For example, the received voice signal of "Sounds good. I want some." is edited as an utterance of the robot 1 to generate an edited voice signal. The edited voice signal is then output from the speaker 13. For example, "Sounds good. I want some." is output in the voice peculiar to the robot 1.

Next, the user A replies to the voice uttered by the robot 1 by speaking to the robot 1. The voice of the user A is collected by the microphone 11, and the voice signal is transmitted to the server 3. For example, when the user A replies "Oh. I forgot to buy pie dough." to "Sounds good. I want some.", the voice signal of "Oh. I forgot to buy pie dough." is transmitted to the server 3.

In the robot 2, the voice editing unit 12 edits the received voice signal as an utterance of the robot 2 and generates an edited voice signal. For example, the received voice signal of "Oh. I forgot to buy pie dough." is edited as an utterance of the robot 2 to generate an edited voice signal. The edited voice signal is then output from the speaker 13. For example, "Oh. I forgot to buy pie dough." is output in the voice peculiar to the robot 2.

Subsequently, the user B replies to the voice uttered by the robot 2 by speaking to the robot 2. The voice of the user B is collected by the microphone 11, and the voice signal is transmitted to the server 3. For example, when the user B replies "Too bad. Next time, then." to "Oh. I forgot to buy pie dough.", the voice signal of "Too bad. Next time, then." is transmitted to the server 3.

Finally, in the robot 1, the voice editing unit 12 edits the received voice signal as an utterance of the robot 1 and generates an edited voice signal. For example, the received voice signal of "Too bad. Next time, then." is edited as an utterance of the robot 1 to generate an edited voice signal. The edited voice signal is then output from the speaker 13. For example, "Too bad. Next time, then." is output in the voice peculiar to the robot 1.

In this way, because humans are conversing with each other through robots, a more natural conversation is possible than with engines using the dictionary type, log type, Markov sentence generation type, and so on. Moreover, since the party each person talks to directly is a robot, the conversation can take place in an atmosphere different from direct conversation between people.
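The relay flow of the first embodiment can be traced with a small simulation. In the Python sketch below, text strings stand in for voice signals, tagging a string with the robot's ID stands in for the voice editing unit 12, and the classes `Robot` and `Server` and the method `relay` are hypothetical names chosen for this illustration. The example utterances paraphrase the apple pie dialogue.

```python
from typing import Dict, List


class Robot:
    def __init__(self, robot_id: str) -> None:
        self.robot_id = robot_id
        self.spoken: List[str] = []   # what this robot's speaker has output

    def receive(self, utterance: str) -> None:
        # Stand-in for voice editing unit 12 and speaker 13: mark the
        # utterance as being in this robot's own voice and "play" it.
        self.spoken.append(f"<{self.robot_id} voice> {utterance}")


class Server:
    def __init__(self) -> None:
        self.robots: Dict[str, Robot] = {}
        self.peer: Dict[str, str] = {}   # robot ID -> connected robot ID

    def connect(self, a: Robot, b: Robot) -> None:
        self.robots[a.robot_id], self.robots[b.robot_id] = a, b
        self.peer[a.robot_id], self.peer[b.robot_id] = b.robot_id, a.robot_id

    def relay(self, sender_id: str, utterance: str) -> None:
        """Forward what the sender's microphone picked up to the connected robot."""
        self.robots[self.peer[sender_id]].receive(utterance)


robot1, robot2 = Robot("robot-1"), Robot("robot-2")
server = Server()
server.connect(robot1, robot2)

server.relay("robot-1", "I'm going to make an apple pie and eat it.")  # user A
server.relay("robot-2", "Sounds good. I want some.")                   # user B
server.relay("robot-1", "Oh. I forgot to buy pie dough.")              # user A
server.relay("robot-2", "Too bad. Next time, then.")                   # user B

for line in robot2.spoken + robot1.spoken:
    print(line)
```

Each user only ever hears their own robot: user B hears robot 2 speak user A's words, and user A hears robot 1 speak user B's replies.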

Next, a modification of the first embodiment described above will be described.

FIG. 5 is a diagram showing a modification of the server 3.

In the modification of the first embodiment, the inter-robot connection management unit 31 of the server 3 additionally contains a matching control unit 33.

In addition to the contents described in the first embodiment, the connection management database 32 stores attribute information of the user who owns each robot. FIG. 6 shows an example of the connection management database 32 in the modification of the first embodiment. Here, the user attribute information is the age, gender, address, hobbies, and the like of the user who owns the robot.

Based on the user attribute information in the connection management database 32, the matching control unit 33 matches users who wish to converse via their robots and establishes a connection between the robots. For example, when the user A is a woman in her twenties and wishes to talk with a similar female user, the matching control unit 33 refers to the user attribute information in the connection management database 32, searches for the identification information of a robot whose owner is a woman in her twenties and which is currently disconnected, and attempts to establish a connection with the corresponding robot. If the connection is established successfully, the conversation operation described above begins. If the connection cannot be established, the matching control unit 33 searches for another robot and attempts to establish a connection with it.
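The search-and-retry behaviour of the matching control unit 33 can be sketched as a loop over candidate records. In the Python sketch below, the record fields, the age range and gender criteria, and the `try_connect` callback that reports whether a connection attempt succeeded are all assumptions made for illustration; the text leaves the matching criteria and the connection procedure open.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class RobotEntry:
    """Hypothetical database row: robot ID, status, and owner attributes."""
    robot_id: str
    connected: bool
    age: int
    gender: str


def find_match(
    entries: List[RobotEntry],
    requester_id: str,
    min_age: int,
    max_age: int,
    gender: str,
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the ID of the first matching robot that accepts a connection."""
    for entry in entries:
        if entry.robot_id == requester_id or entry.connected:
            continue  # skip the requester itself and robots already in a call
        if entry.gender != gender or not (min_age <= entry.age <= max_age):
            continue  # owner attributes do not match the request
        if try_connect(entry.robot_id):
            return entry.robot_id
        # Connection failed: fall through and try the next candidate.
    return None


entries = [
    RobotEntry("robot-1", False, 24, "female"),  # the requester (user A)
    RobotEntry("robot-2", True, 27, "female"),   # matches, but already connected
    RobotEntry("robot-3", False, 45, "male"),    # attributes do not match
    RobotEntry("robot-4", False, 22, "female"),  # matches, but connection fails
    RobotEntry("robot-5", False, 29, "female"),  # matches and connects
]
unreachable = {"robot-4"}
match = find_match(entries, "robot-1", 20, 29, "female",
                   lambda rid: rid not in unreachable)
print(match)
```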

With this configuration, the robot user's wishes regarding a conversation partner can be fulfilled.

Furthermore, the server 3 may have the voice editing unit 12 in place of the robot 1. In this case, the server 3 edits the received user's voice signal for the robot 2 and transmits it to the robot 2. The robot 2 then outputs the edited voice signal from the speaker 13.

Even with this configuration, the same effects as in the first embodiment described above can be obtained.

<Second Embodiment>
A second embodiment of the present invention will be described.

FIG. 7 is a block diagram of the server 3 in the second embodiment.

In addition to the configuration of the first embodiment, the server 3 has a voice recognition unit 34, a filtering unit 35, and an NG word database 36.

The NG word database 36 is a database that stores a group of terms that are inappropriate in conversation (hereinafter referred to as prohibited terms).

The voice recognition unit 34 converts the voice signal transmitted from the robot 1 into text using conventional voice recognition technology.

When the character string of the user's voice converted into text contains a prohibited term, the filtering unit 35 deletes the corresponding portion of the user's voice or converts it into another term. The voice after deletion or conversion is then transferred to the robot at the communication destination.
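The delete-or-convert rule of the filtering unit 35 can be illustrated on the recognized text alone. The Python sketch below assumes a hypothetical `NG_WORDS` mapping in which an empty replacement means "delete" and any other value means "convert to this term"; locating and editing the matching portion of the voice signal itself is outside the scope of this sketch.

```python
import re
from typing import Dict

# Hypothetical contents of the NG word database 36:
# prohibited term -> replacement ("" means the term is deleted).
NG_WORDS: Dict[str, str] = {
    "darn": "",
    "stupid": "silly",
}


def filter_text(text: str, ng_words: Dict[str, str]) -> str:
    """Delete or replace every prohibited term found in the recognized text."""
    for word, replacement in ng_words.items():
        # Match whole words only, ignoring case.
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        text = pattern.sub(replacement, text)
    # Collapse the double spaces left behind by deletions.
    return re.sub(r"\s{2,}", " ", text).strip()


print(filter_text("You are stupid", NG_WORDS))
print(filter_text("that darn oven", NG_WORDS))
```

Whole-word matching like this suits space-delimited languages; Japanese text would need morphological analysis instead, which is not shown here.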

In this way, even if inappropriate words are uttered during a conversation, the other party is not made to feel uncomfortable.

<Third Embodiment>
A third embodiment of the present invention will be described.

The characteristic feature of the third embodiment is that it can collect the data underlying conventional conversation engines of the dictionary type (scenario type), log type, Markov sentence generation type (text generation type), and so on. In the case of the log type in particular, collecting a large number of conversation logs is indispensable for establishing natural conversation between a human and a robot. The third embodiment is therefore characterized by using the communication system of the first embodiment to collect the conversation logs obtained during its conversations.

FIG. 8 is a block diagram of the server 3 in the third embodiment.

In addition to the configurations of the first and second embodiments, the server 3 in the third embodiment has a conversation log collection unit 37 and a conversation log database 38.

The conversation log collection unit 37 collects the conversation between the robots 1 and 2, converted into text by the voice recognition unit 34, in the conversation log database 38 as a conversation log.

This makes it possible to collect many conversation logs of natural exchanges. The collected conversation logs can then be used as data for a log-type conversation engine and the like.
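One way to picture the collected log is as an ordered list of recognized utterances, from which utterance and reply pairs can later be extracted for a log-type engine. The Python sketch below is an assumption-laden illustration: the class `ConversationLog`, its `add` and `exchange_pairs` methods, and the pairing rule (consecutive turns from different robots) are not taken from the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConversationLog:
    """Hypothetical stand-in for one conversation in conversation log database 38."""
    # Each turn is (ID of the robot that picked up the voice, recognized text).
    turns: List[Tuple[str, str]] = field(default_factory=list)

    def add(self, source_robot_id: str, text: str) -> None:
        self.turns.append((source_robot_id, text))

    def exchange_pairs(self) -> List[Tuple[str, str]]:
        """Return (utterance, reply) pairs from consecutive turns by different sides."""
        pairs = []
        for (src_a, text_a), (src_b, text_b) in zip(self.turns, self.turns[1:]):
            if src_a != src_b:
                pairs.append((text_a, text_b))
        return pairs


log = ConversationLog()
log.add("robot-1", "I'm going to make an apple pie and eat it.")
log.add("robot-2", "Sounds good. I want some.")
log.add("robot-1", "Oh. I forgot to buy pie dough.")
log.add("robot-2", "Too bad. Next time, then.")

for utterance, reply in log.exchange_pairs():
    print(f"{utterance!r} -> {reply!r}")
```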

In the embodiments described above, each unit is configured by hardware, but the units can also be configured by a program that causes an information processing device (CPU) to perform the operations described above.

Although the present invention has been described above with reference to preferred embodiments, the present invention is not necessarily limited to those embodiments and can be modified and implemented in various ways within the scope of its technical idea.

DESCRIPTION OF SYMBOLS
1 Robot
2 Robot
3 Server
11 Microphone
12 Voice editing unit
13 Speaker
14 Control unit
31 Inter-robot connection management unit
32 Connection management database
33 Matching control unit
34 Voice recognition unit
35 Filtering unit
36 NG word database
37 Conversation log collection unit
38 Conversation log database

Claims

A communication system comprising:
a robot installed on a user side; and a server,
wherein the robot has:
a microphone that collects a user's voice;
transmission/reception means for transmitting a voice signal of the user's voice collected by the microphone to the server through a network, and receiving a user's voice signal sent through the network;
voice editing means for editing the user's voice signal sent through the network as an utterance of the robot and generating an edited voice signal; and
at least one speaker for outputting the edited voice signal,
and wherein the server has
connection management means for managing transmission and reception of signals of the robot.

The communication system according to claim 1, wherein the server has:
a user attribute information database storing attribute information of a user who owns the robot; and
a matching control unit that refers to the user attribute information database, performs user matching based on the user attribute information, and establishes a connection between the robots corresponding to the matched users.

The communication system according to claim 1 or 2, wherein the server has:
a prohibited term database storing prohibited terms for conversation;
voice recognition means for recognizing a user's voice signal sent through the network and converting it into text; and
filtering means for referring to the prohibited term database, searching the character string into which the user's voice has been converted for a prohibited term, and, when a prohibited term is included, deleting the user's voice signal or converting a part thereof.

The communication system according to claim 1, wherein the server has:
voice recognition means for recognizing a user's voice signal sent through the network and converting it into text; and
conversation log collection means for collecting, as a conversation log, the character string into which the user's voice has been converted.

A conversation log collection system in a communication system, comprising:
a robot installed on a user side; and a server,
wherein the robot has:
a microphone that collects a user's voice;
transmission/reception means for transmitting a voice signal of the user's voice collected by the microphone to the server through a network, and receiving a user's voice signal sent through the network;
voice editing means for editing the user's voice signal sent through the network as an utterance of the robot and generating an edited voice signal; and
at least one speaker for outputting the edited voice signal,
and wherein the server has:
connection management means for managing transmission and reception of signals of the robot;
voice recognition means for recognizing a user's voice signal sent through the network and converting it into text; and
conversation log collection means for collecting, as a conversation log, the character string into which the user's voice has been converted.

A conversation log collection server having:
connection management means for managing transmission and reception of signals between robots, each robot having a microphone for collecting a user's voice, transmission/reception means for transmitting a voice signal of the user's voice collected by the microphone to the server through a network and receiving a user's voice signal sent through the network, voice editing means for editing the user's voice signal sent through the network as an utterance of the robot and generating an edited voice signal, and at least one speaker for outputting the edited voice signal;
voice recognition means for recognizing a user's voice signal sent through the network and converting it into text; and
conversation log collection means for collecting, as a conversation log, the character string into which the user's voice has been converted.

A communication method, wherein:
a robot installed on the side of a speaking user collects the user's voice with a microphone;
the robot installed on the side of the speaking user transmits a voice signal of the user's voice collected by the microphone to a server through a network;
the server transmits the received voice signal of the user's voice to a destination robot;
the destination robot receives the user's voice signal sent through the network;
the destination robot edits the user's voice signal as an utterance of the robot to generate an edited voice signal; and
the destination robot outputs the edited voice signal.

The communication method according to claim 7, wherein the server
refers to a user attribute information database storing attribute information of a user who owns the robot, performs user matching based on the user attribute information, and establishes a connection between the robots corresponding to the matched users.

The communication method according to claim 7 or 8, wherein the server
recognizes a user's voice signal sent through the network and converts it into text, and
refers to a prohibited term database storing prohibited terms for conversation, searches the character string into which the user's voice has been converted for a prohibited term, and, when a prohibited term is included, deletes the user's voice signal or converts a part thereof.

The communication method according to claim 7, wherein the server
recognizes a user's voice signal sent through the network and converts it into text, and
collects, as a conversation log, the character string into which the user's voice has been converted.