JP2013086226A

JP2013086226A - Communication robot

Info

Publication number: JP2013086226A
Application number: JP2011230562A
Authority: JP
Inventors: Hirotada Ueda; 博唯上田
Original assignee: Kyoto Sangyo University
Current assignee: Kyoto Sangyo University
Priority date: 2011-10-20
Filing date: 2011-10-20
Publication date: 2013-05-13

Abstract

PROBLEM TO BE SOLVED: To provide a communication robot that makes it easier for a user to talk to the robot and to feel friendly toward it when speaking to it. SOLUTION: The communication robot includes voice detection means that detects words spoken by the user and reaction expressing means that expresses prescribed responsive reactions. The robot reacts to the user's spoken words by expressing responsive reactions of a plurality of predetermined patterns. In doing so, the robot determines at least one of the following: whether the spoken words are a normal expression, whether they are an agreement request expression, and whether they are an assertive expression. It then expresses different responsive reactions based on the determination results.

Description

The present invention relates to a communication robot that generates body motions from detected speech.

Communication robots (hereinafter also simply "robots") intended for communicating with humans are under development. Some robots of this type go beyond verbal communication through words and also perform non-verbal communication through body motions, such as changing facial expression or making gestures (hereinafter also "listener motions").

With a robot that performs such listener motions, the robot takes on a listening posture when a person speaks to it. As a result, the speaker experiences natural communication, as if talking with another person, and does not feel the awkwardness of talking to an object. This produces the so-called entrainment effect, in which a person finds it easier to talk when speaking to the robot. Such a robot is disclosed, for example, in Non-Patent Document 1.

Tomio Watanabe, "Entrainment and Embodiment in Embodied Communication: Through the Development of E-COSMIC, an Embodied Communication System in Which Hearts Connect," Baby Science 2002, Vol. 2, Article 1 (journal of the Japanese Society of Baby Science), published May 2003.

In recent years, research on the coexistence of humans and robots has attracted attention. Robots that help people acquire knowledge or work alongside people are being widely developed for the many environments in which people live in society: public facilities such as hospitals, commercial facilities such as restaurants, educational facilities such as schools, entertainment facilities such as game arcades, and nursing-care facilities such as care centers.

As one such robot that coexists with people, the present inventors considered developing a robot that supports the emotional side of human social life. Specifically, they considered using the robot's motions to relieve the stress that people experience in social life.

When the inventors investigated means of relieving stress, they found that having someone listen to one's complaints is effective. However, when a person complained to a robot that performed conventional listener motions, the person could not sufficiently relieve stress.

The following reasons are conceivable. In general, a speaker finds it easier to voice complaints when the listener shows an attitude of listening attentively to what the speaker says. The speaker also feels better, with pent-up feelings dissipating, when the listener strongly agrees. In other words, listening to complaints demands stronger empathy toward the speaker than ordinary conversation does.
However, the listener motions performed by conventional robots could not make the user, as the speaker, sufficiently feel that the robot empathized with what was said.

Accordingly, an object of the present invention is to provide a communication robot with which the user finds it easier to talk, and toward which the user feels greater affinity, when speaking to the robot.

One aspect of the present invention for solving the above problem is a communication robot having voice detection means that detects spoken words uttered by a user, and reaction expression means that expresses predetermined response reactions that the user can perceive visually and/or audibly, the reaction expression means expressing response reactions of a plurality of predetermined patterns in reaction to the user's spoken words, wherein the robot has voice determination means that determines the content of the spoken words detected by the voice detection means; the voice determination means is capable of at least one of determining whether the spoken words are a normal expression, determining whether the spoken words are a consent request expression, and determining whether the spoken words are an assertive expression; and different response reactions are expressed based on the determination result of the voice determination means.

In this aspect, the robot has voice determination means that determines the content of spoken words, so it can determine what kind of content the words spoken by the user have. It can then express an appropriate response reaction according to the determined content. For example, depending on the content of the user's talk, it can switch among motions such as giving a backchannel response, agreeing, and agreeing more strongly. This brings the robot's responses closer to those of a human (or animal), and more natural communication is established with the user. The effect is that the user finds it easier to talk when speaking to the robot.
Furthermore, by expressing an appropriate response reaction according to the content of the user's talk, the robot can make the user feel that the conversation partner (the robot) strongly empathizes with the user's opinion. Specifically, if the robot performs a predetermined motion such as a backchannel response at appropriate moments while the user is speaking, the user feels that the conversation partner (the robot) is listening attentively. If the predetermined motion is one that indicates approval or agreement, such as nodding the head up and down, the user feels as though the robot has agreed with what was said. Because the robot listens intently and agrees, the user feels that the robot strongly empathizes with the user's opinion. The effect is that the user feels greater affinity toward the conversation partner (the robot).
The term "robot" here includes tangible objects (devices, hardware, dolls) that operate according to a program.

A second aspect of the present invention is a communication robot having voice detection means that detects spoken words uttered by a user, and reaction expression means that expresses predetermined response reactions that the user can perceive visually and/or audibly, the reaction expression means expressing response reactions of a plurality of predetermined patterns in reaction to the user's spoken words, wherein the robot has voice determination means that determines the content of the spoken words detected by the voice detection means; the voice determination means determines what kind of content the spoken words have based on at least one of the sound at or near the end of the spoken words, the length of the spoken words, and the intensity of the voice; and different response reactions are expressed based on the determination result of the voice determination means.

In the second aspect, the robot has voice determination means that determines the content of spoken words, and can determine that content from information such as the sound at the end of the user's spoken words, the sound near the end, the length of the spoken words, and the intensity of the voice. It can then express an appropriate response reaction according to the determined content. That is, as in the aspect described above, changing the response reaction according to the content of the talk brings the robot's responses closer to those of a human, and more natural communication is established with the user. The effect is that the user finds it easier to talk when speaking to the robot.
Also, as in the aspect described above, expressing an appropriate response reaction according to the content of the user's talk makes the user feel that the conversation partner (the robot) strongly empathizes with the user's opinion, which in turn makes the user feel affinity toward the robot.

A third aspect of the present invention is the communication robot of the above aspects, wherein the voice determination means determines the content of the spoken words based on the sentence-final particle of the words spoken by the user.

In the third aspect, the content of the spoken words is determined based on the sentence-final particle of the words spoken by the user. Compared with determining the content from the whole sentence or phrase, this allows the determination to be made with an algorithm that is relatively easy to implement in hardware. In addition, because the content is not determined from the whole sentence or phrase, the determination result can be obtained quickly at run time.
The term "sentence-final particle" here includes interjectional particles and, more specifically, covers any part of speech that attaches to the end of a sentence or phrase and adds meaning. It further covers, in foreign languages such as English and German, elements appended to the end of a sentence or phrase to add meaning, such as the tag at the end of a tag question.

A fourth aspect of the present invention is the communication robot of the above aspects, wherein the response reactions are made to differ by varying the number of times a specific motion is performed, and the number of times the specific motion is performed changes based on the determination result of the voice determination means.

In the fourth aspect, different response reactions are produced by varying the number of times a specific motion is performed. That is, repeating the same motion strengthens or weakens the meaning the motion conveys. For example, a robot that nods several times signals stronger agreement than a robot that nods once. The difference in the repetition count of the same motion therefore yields two kinds of response: one expressing agreement and one expressing strong agreement. Producing different response reactions in this way brings the robot's response to the user's spoken words closer to a human response.

A fifth aspect of the present invention is the communication robot of the above aspects, wherein the voice determination means determines that the spoken words are a consent request expression when the ending of the words spoken by the user is one or more words selected from the following (1) to (4).
(1) "ね" (ne)
(2) "う" (u)
(3) "お" (o)
(4) "か" (ka)

In the fifth aspect, the spoken words are determined to be a consent request expression when their ending is one or more words selected from (1) to (4) above.
When the user speaks in Japanese and the spoken words end with one or more of the words (1) to (4), it is sufficiently likely that the user is asking for agreement with what was said. Determining in such cases that the spoken words are a consent request expression therefore improves the accuracy of the determination of their content.

A sixth aspect of the present invention is the communication robot of the above aspects, wherein the voice determination means determines that the spoken words are an assertive expression when the ending of the words spoken by the user is one or two words selected from the following (5) and (6).
(5) "だ" (da)
(6) "や" (ya)

In the sixth aspect, the spoken words are determined to be an assertive expression when their ending is one or two words selected from (5) and (6) above.
When the user speaks in Japanese and the spoken words end with one or two of the words (5) and (6), it is sufficiently likely that the user is asserting what was said. Determining in such cases that the spoken words are an assertive expression therefore improves the accuracy of the determination of their content.
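As a rough illustration of the fifth and sixth aspects, the following Python sketch classifies a written Japanese utterance by its final character. The function and constant names are hypothetical, operating on text rather than on detected sound is an assumption, and the fall-through to a normal expression follows the ninth aspect rather than the fifth or sixth.

```python
# Ending words taken from items (1)-(4) and (5)-(6) of the fifth and sixth aspects.
CONSENT_REQUEST_ENDINGS = ("ね", "う", "お", "か")
ASSERTIVE_ENDINGS = ("だ", "や")

# Trailing punctuation and spaces to ignore. This is an assumption made for
# text input; the patent itself works on the detected sound of the utterance.
TRAILING_PUNCTUATION = "。．.！!？?、， 　"


def classify_by_ending(utterance: str) -> str:
    """Return 'assertive', 'consent_request', or 'normal' from the last character."""
    text = utterance.rstrip(TRAILING_PUNCTUATION)
    if text.endswith(ASSERTIVE_ENDINGS):
        return "assertive"
    if text.endswith(CONSENT_REQUEST_ENDINGS):
        return "consent_request"
    # Neither list matched: treated as a normal expression (ninth aspect).
    return "normal"
```

Only the end of the string is inspected, which reflects the point of the third aspect that the whole sentence need not be analyzed. The two ending lists are disjoint, so the order of the two checks does not change the result.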

A seventh aspect of the present invention is the communication robot of the above aspects, wherein the robot has a shape imitating a human or an animal.

In the seventh aspect, the robot has the shape of a human or an animal, which people generally find approachable and easy to talk to, or a shape that simplifies or abbreviates such a figure by exaggerating and emphasizing its features. This more reliably removes the user's sense of awkwardness at talking to an object, so the user finds it easier to talk when speaking to the robot.
The term "human or animal" here also includes fictional characters from creative works such as films, novels, and comics, and anthropomorphized objects.

An eighth aspect of the present invention is the communication robot of the above aspects, wherein the response reaction is one motion, or a combination of two or more motions, selected from the following (1) to (3).
(1) Nodding
(2) Making a cry
(3) Moving any of the four limbs

In the eighth aspect, the robot expresses response reactions that resemble motions a human or animal would make, so a user speaking to the robot gets the impression of speaking to a human or an animal. The user therefore does not feel the awkwardness of talking to an object and finds it easier to talk.

A ninth aspect of the present invention is the communication robot of the above aspects, wherein the voice determination means determines that the spoken words are a consent request expression when their ending is one or more words selected from the following (1) to (4), determines that they are an assertive expression when their ending is one or two words selected from the following (5) and (6), and determines that they are a normal expression when they are neither a consent request expression nor an assertive expression; the response reactions are made to differ by varying the number of times a nodding motion is performed; the number of nods performed when the spoken words are determined to be a consent request expression is greater than the number performed when they are determined to be a normal expression; and the number of nods performed when they are determined to be an assertive expression is greater than the number performed when they are determined to be a consent request expression.
(1) "ね" (ne)
(2) "う" (u)
(3) "お" (o)
(4) "か" (ka)
(5) "だ" (da)
(6) "や" (ya)

In the ninth aspect, the determination of the content of the spoken words can be made with high accuracy, and a human-like response reaction is expressed based on the determined content. More specifically, depending on the content of the spoken words, the robot nods a small number of times, as a backchannel response, or nods many times, to show strong agreement. This brings the robot's responses closer to those of a human, and more natural communication is established with the user. The effect is that the user finds it easier to talk when speaking to the robot. Furthermore, as described above, expressing a response reaction that matches the content of the spoken words makes the user feel that the conversation partner (the robot) strongly empathizes with the user's opinion. The effect is that the user feels greater affinity toward the conversation partner (the robot).

According to the present invention, the robot's responses are close to those of a human (or animal), so more natural communication can be established with the user. The effect is that the user finds it easier to talk when speaking to the robot.
In addition, according to the present invention, when the user speaks to the robot, the robot expresses an appropriate response reaction according to the content of the talk. This gives the user the impression that the conversation partner (the robot) strongly empathizes with the user's opinion, which has the effect of making the user feel affinity toward the robot.

FIG. 1 is a perspective view showing a robot according to a first embodiment of the present invention.
FIG. 2 is a block diagram showing the electrical configuration of the robot of FIG. 1.
FIG. 3 is a flowchart showing the procedure of the communication operation of the robot of FIG. 1.
FIG. 4 is a perspective view showing a robot according to an embodiment different from the robot of FIG. 1.
FIG. 5 is a perspective view showing a robot according to an embodiment different from the robots of FIGS. 1 and 4.

Embodiments of the present invention are described further below, but the present invention is not limited to these examples.

When a user speaks to it, the robot 1 (communication robot) of the first embodiment of the present invention acquires the spoken words as utterance information. It then performs a prescribed communication operation based on the acquired utterance information and a behavior program stored in memory.

Such a robot is described in detail below.

As shown in FIG. 1, the robot 1 has a shape imitating a human and consists of a torso 2 together with a head 3, hand parts 4, and foot parts 5 attached integrally to the torso 2.

The head 3 (reaction expression means) is attached to the torso 2 through a mechanical structure (not shown) formed of a motor, a cam mechanism, a crank mechanism, and the like. The head 3 can therefore rotate and tilt relative to the torso 2 (response reactions). That is, the head 3 is attached so that it can perform motions such as nodding and shaking.

The hand parts 4 and foot parts 5 are likewise attached to the torso 2 through mechanical structures, so that the tips of the hand parts 4 and of the foot parts 5 can swing relative to the torso 2. That is, they are attached so that they can perform motions such as waving a hand and raising a foot.

Next, the electrical configuration of the robot 1 is described with reference to FIG. 2.
The robot 1 includes a CPU 15, a memory 16, and a microphone 17 (voice detection means), which are connected via a bus. The CPU 15 is also connected via the bus to a dialogue content DB 18 and a dialogue action DB 19.

The CPU 15 is a well-known CPU, also called a microprocessor, that can process and compute information by reading and executing a program on a storage device.

The memory 16 is a storage device made up of one or more of the following in combination: ROM such as EPROM or EEPROM, RAM such as DRAM or SDRAM, and what is called auxiliary or secondary storage such as flash memory or an HDD. The memory 16 stores various programs and data for controlling the operation of the robot 1, chiefly a voice analysis unit 22 and a dialogue action execution unit 23. The memory 16 may also be constituted by external storage media such as a hard disk, floppy disk (registered trademark), MO, CD, DVD, BD, or magnetic tape, together with their reading devices.

The voice analysis unit 22 has the function of identifying the content (expression type) of the utterance information based on a signal representing the utterance information transmitted from the microphone 17 and on the information stored in the dialogue content DB 18, described later.

The dialogue action execution unit 23 has the function of causing the robot 1 to perform a specific motion based on the content (expression type) of the utterance information identified by the voice analysis unit 22 and on the information stored in the dialogue content DB 18, described later.

The microphone 17 is a well-known microphone that can collect external sound and convert it into an electrical signal.

The dialogue content DB 18 stores words that can end an utterance in association with expression types. In this embodiment, a group of words that includes the six words "ね" (ne), "う" (u), "お" (o), "か" (ka), "だ" (da), and "や" (ya), as well as words (phrases) ending in them, is stored in association with a group of expression types that includes "consent request expression" and "assertive expression". More specifically, at least the four words "ね", "う", "お", and "か" and words ending in them are associated with "consent request expression", and the two words "だ" and "や" and words ending in them are associated with "assertive expression".

The dialogue action DB 19 stores expression types in association with motions of the robot 1. In this embodiment, a group of expression types that includes "assertive expression", "consent request expression", and "normal expression" is stored in association with a group of motions of the robot 1 that includes "nod three times", "nod twice", and "nod once". More specifically, at least "assertive expression" is associated with the motion "nod three times", "consent request expression" with "nod twice", and "normal expression" with "nod once".
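The two databases can be pictured as a pair of lookup tables chained together. The Python sketch below is one possible in-memory representation, not the storage format of the embodiment; the dictionary names, the string labels for expression types, and the helper function are all hypothetical.

```python
# Dialogue content DB 18: sentence-ending word -> expression type.
DIALOGUE_CONTENT_DB = {
    "ね": "consent_request",
    "う": "consent_request",
    "お": "consent_request",
    "か": "consent_request",
    "だ": "assertive",
    "や": "assertive",
}

# Dialogue action DB 19: expression type -> number of nods.
DIALOGUE_ACTION_DB = {
    "assertive": 3,
    "consent_request": 2,
    "normal": 1,
}


def nods_for_ending(ending_word: str) -> int:
    """Chain the two lookups: ending word -> expression type -> nod count."""
    # A phrase that ends in a registered word shares that word's type, so
    # only the last character is used as the key (a simplifying assumption).
    expression_type = DIALOGUE_CONTENT_DB.get(ending_word[-1:], "normal")
    return DIALOGUE_ACTION_DB[expression_type]
```

Keeping the word table and the action table separate means the nod counts can be changed without touching the word list, as long as the ordering normal < consent request < assertive required by the ninth aspect is preserved.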

When the user speaks to it, the robot 1 of the present invention performs a communication operation according to the content of the utterance. This communication operation brings the response of the robot 1 close to a human response, and natural communication is established between the user and the robot 1.
The communication operation, which is the characteristic operation of the present invention, is described in detail below with reference to FIG. 3.

When the user speaks to the robot 1, the robot 1 acquires the voice as utterance information through the microphone 17 (step 1). When the CPU 15 confirms that the microphone 17 has output a signal representing utterance information, it has the voice analysis unit 22 analyze the utterance information (step 2). That is, the utterance information is divided into sentences as necessary, and the sound at the end of an extracted sentence, or a sound near its end, is identified. The identified sound is then compared with the information stored in the dialogue content DB 18.
When the utterance information is divided into sentences, each sentence is identified from information such as the length of the silence between sentences, the variation in loudness (intonation) within the utterance information, and the time elapsed since the voice began.
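
One of the cues mentioned above, the silence between sentences, can be sketched as follows. This is an illustration under stated assumptions rather than the method of the specification: the input format (start and end times of voiced stretches), the function name `split_into_sentences`, and the 0.6 second threshold are all hypothetical, and the intonation and elapsed-time cues are not modeled.

```python
from typing import List, Tuple

# (start_sec, end_sec) of one voiced stretch; this input format is an assumption.
Segment = Tuple[float, float]


def split_into_sentences(
    segments: List[Segment], min_pause: float = 0.6
) -> List[List[Segment]]:
    """Group voiced stretches into sentences, cutting where the silence
    between two consecutive stretches is at least min_pause seconds.
    The default threshold is a hypothetical value, not from the source."""
    sentences: List[List[Segment]] = []
    current: List[Segment] = []
    for seg in segments:
        # Silence = start of this stretch minus end of the previous one.
        if current and seg[0] - current[-1][1] >= min_pause:
            sentences.append(current)
            current = []
        current.append(seg)
    if current:
        sentences.append(current)
    return sentences
```

The last stretch of each returned group is where the ending sound of that sentence would be looked for.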

If the sound at or near the end of the utterance information is a word associated with “assertive expression” (Yes in step 3), the CPU 15 carries out the assertion response operation by means of the dialogue action execution unit 23 (step 4). That is, from the analysis result of the voice analysis unit 22, which found the content of the utterance information to be an “assertive expression”, and from the information stored in the dialogue action DB 19, the action that the robot 1 should perform in response to the utterance is identified, and the robot 1 is made to perform it. More specifically, of the phrases or sounds making up the sentence identified from the utterance information, the phrase or sound located at the very end of the sentence is compared with the words, stored in the dialogue action DB 19, that can serve as endings in conversation. In this embodiment, when the content of the utterance information is identified as an “assertive expression”, the robot 1 performs the “nod three times” operation.

In contrast, if the sound at or near the end of the utterance information is not a word associated with “assertive expression” (No in step 3) but is a word associated with “agreement request expression” (Yes in step 5), the CPU 15 carries out the agreement response operation by means of the dialogue action execution unit 23 (step 6). That is, as in the case described above, the action that the robot 1 should perform in response to the utterance is identified from the analysis result of the voice analysis unit 22 and the information stored in the dialogue action DB 19, and the robot 1 is made to perform it. In this embodiment, when the content of the utterance information is identified as an “agreement request expression”, the robot 1 performs the “nod twice” operation.

Furthermore, if the sound at or near the end of the utterance information is neither a word associated with “assertive expression” nor a word associated with “agreement request expression” (No in both step 3 and step 5), the CPU 15 carries out the normal response operation by means of the dialogue action execution unit 23 (step 7). That is, when the analysis by the voice analysis unit 22 finds that the content of the utterance information is ordinary and does not call for any particular operation, the content is identified as a “normal expression”. The action that the robot 1 should perform in response to the utterance is then identified from the identified expression type and the information stored in the dialogue action DB 19, and the robot 1 is made to perform it. In other words, the robot 1 is made to perform the action for an ordinary utterance. In this embodiment, when the content of the utterance information is identified as a “normal expression”, the robot 1 performs the “nod once” operation.

In this way, the robot 1 performs the operations of steps 1 to 7 described above for as long as the user is speaking.
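
The decision flow of steps 3 to 7 can be summarized in a short sketch. It assumes the sentence has already been split into romanized tokens, and it borrows the ending words listed in the claims ("Da", "Ya" for assertive; "Ne", "U", "O", "ka" for agreement request) as they appear in the translated text. The names `classify`, `respond`, and the two sets are hypothetical, and the actual contents of the dialogue content DB 18 may differ.

```python
from typing import List

# Ending words taken from the translated claims; the real DB 18 contents
# may differ. The set names are hypothetical.
ASSERTIVE_ENDINGS = {"da", "ya"}
AGREEMENT_REQUEST_ENDINGS = {"ne", "u", "o", "ka"}

# Nod counts of this embodiment, keyed by expression type.
NODS = {"assertive": 3, "agreement_request": 2, "normal": 1}


def classify(tokens: List[str]) -> str:
    """Classify a sentence from its last token, following steps 3, 5, and 7."""
    ending = tokens[-1].lower() if tokens else ""
    if ending in ASSERTIVE_ENDINGS:            # step 3: assertive?
        return "assertive"
    if ending in AGREEMENT_REQUEST_ENDINGS:    # step 5: agreement request?
        return "agreement_request"
    return "normal"                            # step 7: everything else


def respond(tokens: List[str]) -> int:
    """Return the number of nods the robot performs (steps 4, 6, or 7)."""
    return NODS[classify(tokens)]
```

Checking the assertive endings first mirrors the order of the flowchart: step 5 is reached only when step 3 answers No.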

By having the robot 1 perform this communication operation, more natural communication is established between the user and the robot 1. That is, the user's sense of discomfort at talking to a machine is dispelled, and the user can see that the other party (the robot 1) is listening attentively to what the user says. The user therefore feels as if the robot 1 empathizes with what is being said, and is in the same state as when another person has listened to the user's complaints. This has the effect of relieving the user's stress.

This concludes the description of the communication operation.

The associations (table structures) of the dialogue content DB 18 and the dialogue action DB 19 in the embodiment described above are merely examples. The information stored in the databases may be associated (structured into tables) in any way, as long as the structure can realize the technical idea of the present invention.

Accordingly, the number of databases is not limited to two. There may be one database, or three or more. That is, it suffices that the information needed to realize the technical idea of the present invention, such as the words forming the end of the utterance information, the types of expression, and the actions of the robot, is stored in association with each other.
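
As one illustration of the single-database variant, the ending word, the expression type, and the robot action could be held together in one record. This is only a sketch: the `Rule` record, the `RULES` list, and `find_rule` are hypothetical names, and the ending words are the romanized ones from the translated claims.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """One row of a hypothetical combined table: ending word,
    expression type, and robot action stored in association."""
    ending: str
    expression: str
    action: str


RULES: List[Rule] = [
    Rule("da", "assertive", "nod three times"),
    Rule("ya", "assertive", "nod three times"),
    Rule("ne", "agreement request", "nod twice"),
    Rule("u", "agreement request", "nod twice"),
    Rule("o", "agreement request", "nod twice"),
    Rule("ka", "agreement request", "nod twice"),
]

# Fallback used when no stored ending matches (the "normal expression" case).
DEFAULT_RULE = Rule("", "normal", "nod once")


def find_rule(ending: str) -> Rule:
    """Look up the row whose ending word matches, else the default row."""
    for rule in RULES:
        if rule.ending == ending:
            return rule
    return DEFAULT_RULE
```

With this layout a single lookup yields both the expression type and the action, whereas the two-database embodiment performs one lookup in each database.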

In the embodiment described above, the sound at or near the end of the utterance is stored in association with the type of expression, but the present invention is not limited to this. For example, at least one of the length of the utterance information, the intensity of the voice, and the sentence-final particle may be stored in association with the type of expression. That is, it suffices that the type of expression of the utterance information can be identified from the signal representing the utterance information and the stored information.

In the embodiment described above, the data was built on the assumption that the user speaks to the robot 1 in standard Japanese, but the present invention is not limited to this. For example, the sounds at or near the end of the utterance that are associated with the types of expression may be ones chosen for a foreign language or for a dialect. That is, the robot 1 of the present invention may support any or all of Japanese, foreign languages, and their dialects.

In the embodiment described above, so that the user feels that the robot 1 empathizes with what the user says, the robot 1 performs operations expressing agreement that is strongest for “assertive expression”, then “agreement request expression”, then “normal expression”. That is, the number of nods (response reactions) is greatest for “assertive expression”, smaller for “agreement request expression”, and smallest for “normal expression”, which varies the strength of agreement. However, the communication operation performed by the robot of the present invention is not limited to this. For example, a different operation may be performed for each of “assertive expression”, “agreement request expression”, and “normal expression”, so that the response reactions differ. That is, it suffices that the required strength of agreement is determined from the acquired utterance information and reference information stored in advance, and that a response reaction corresponding to the determination is expressed.
Accordingly, the operation performed as a response reaction is not limited to moving the head 3, as in nodding, and may instead move the hand 4 or the foot 5. However, it is desirable to have the robot 1 perform an operation such as nodding in response to an utterance, that is, an operation a person would actually perform when listening to someone else, because this makes communication between the user and the robot more natural.

In the embodiment described above, the robot 1 is modeled on a person, but the present invention is not limited to this. For example, as shown in FIG. 4, it may be a robot 50 with a more mechanical (inorganic) shape. That is, the head need not be rounded and may be rectangular, and the tip of each hand may be substantially C-shaped. It may also be a robot 51 modeled on an animal such as a dog, as shown in FIG. 5. In that case, the response operation may be crying out, barking, moving the limbs, moving the tail, moving the ears, and the like. More specifically, the robot may be modeled on a fictional character from a creative work such as a film, novel, or comic, or on a personified object. However, a shape modeled on a person is desirable, because it is expected to make it even easier for the user to speak to the robot without discomfort.

1, 50, 51 Robot (communication robot)
3 Head (reaction expression means)
17 Microphone (voice detection means)
22 Voice analysis unit (voice determination means)

Claims

A communication robot comprising voice detection means for detecting words spoken by a user, and reaction expression means for expressing a predetermined response reaction that the user can perceive visually and/or audibly, the communication robot reacting to the user's spoken words by expressing response reactions of a plurality of patterns,
the communication robot further comprising voice determination means for determining the content of the spoken words detected by the voice detection means,
wherein the voice determination means is capable of making at least one of a determination of whether the spoken words are a normal expression, a determination of whether the spoken words are an agreement request expression, and a determination of whether the spoken words are an assertive expression, and
wherein different response reactions are expressed based on the determination result of the voice determination means.

A communication robot comprising voice detection means for detecting words spoken by a user, and reaction expression means for expressing a predetermined response reaction that the user can perceive visually and/or audibly, the communication robot reacting to the user's spoken words by expressing response reactions of a plurality of patterns,
the communication robot further comprising voice determination means for determining the content of the spoken words detected by the voice detection means,
wherein the voice determination means determines the content of the spoken words based on at least one of the sound at or near the end of the spoken words, the length of the spoken words, and the intensity of the voice, and
wherein different response reactions are expressed based on the determination result of the voice determination means.

The communication robot according to claim 1, wherein the voice determination means determines the content of the spoken words based on a sentence-final particle of the words spoken by the user.

The communication robot according to any one of claims 1 to 3, wherein the response reactions are made to differ by changing the number of times a specific operation is executed, and the number of times the specific operation is executed changes based on the determination result of the voice determination means.

The communication robot according to any one of claims 1 to 4, wherein the voice determination means determines the words spoken by the user to be an agreement request expression when their ending is one or more words selected from the following words (1) to (4):
(1) "Ne"
(2) "U"
(3) "O"
(4) "ka"

The communication robot according to any one of claims 1 to 5, wherein the voice determination means determines the words spoken by the user to be an assertive expression when their ending is one or two words selected from the following words (5) and (6):
(5) "Da"
(6) "Ya"

The communication robot according to claim 1, wherein the communication robot has a shape imitating a person or an animal.

The communication robot according to claim 7, wherein the response reaction is one of, or a combination of two or more of, the following operations (1) to (3):
(1) nodding
(2) crying out
(3) moving any of the four limbs

The voice determination means determines the words spoken by the user to be an agreement request expression when their ending is one or more words selected from the following words (1) to (4),
(1) "Ne"
(2) "U"
(3) "O"
(4) "ka"
When the ending of the words spoken by the user is one or two words selected from the following words (5) and (6), the spoken words are determined to be an assertive expression,
(5) "Da"
(6) "Ya"
If the words spoken by the user are neither an agreement request expression nor an assertive expression, they are determined to be a normal expression,
The response reactions are made to differ by changing the number of times the nodding operation is performed,
The number of nods performed when the spoken words are determined to be an agreement request expression is greater than the number performed when they are determined to be a normal expression,
The communication robot according to claim 8, wherein the number of nods performed when the spoken words are determined to be an assertive expression is greater than the number performed when they are determined to be an agreement request expression.