JP7576290B1

JP7576290B1 - TOPIC MODULE SET CREATION DEVICE, INTERACTION DEVICE, TOPIC MODULE SET CREATION METHOD, INTERACTION METHOD, AND COMPUTER PROGRAM

Info

Publication number: JP7576290B1
Application number: JP2023095767A
Authority: JP
Inventors: 惇馬場; 智久挾間; 惇也中西; 雄一郎吉川; 浩石黒
Original assignee: Osaka University NUC; CyberAgent Inc
Current assignee: Osaka University NUC; CyberAgent Inc
Priority date: 2023-06-09
Filing date: 2023-06-09
Publication date: 2024-10-31
Anticipated expiration: 2043-06-09
Also published as: JP2024176904A

Abstract

[Problem] To continue a dialogue even when complex topic transition patterns cannot be fully anticipated.
[Solution] A dialogue device comprising: a topic module set creation unit that creates, based on one or more scenarios each describing a one-way dialogue flow, a topic module set including multiple topic modules, each indicating a topic to be provided to the user who is the dialogue partner when a predetermined condition is satisfied; a topic determination unit that determines a topic according to the state of the user or the system, based on the topic module set created by the topic module set creation unit and state information representing at least the state of the user or the system; and an output unit that outputs content according to the determined topic.
[Selected Figure] Figure 1

Description

Article 30, paragraph 2 of the Patent Act applies. Publication date: May 3, 2023. Publication: Journal of the Robotics Society of Japan, Vol. 41, No. 3 (April 2023 issue), pp. 291-302, The Robotics Society of Japan. [Web publication] URL: https://www.jstage.jst.go.jp/article/jrsj/41/3/41_41_291/_article/-char/ja DOI: https://doi.org/10.7210/jrsj.41.291 <Materials> Academic and technical paper published in the Journal of the Robotics Society of Japan

The present invention relates to a topic module set creation device, a dialogue device, a topic module set creation method, a dialogue method, and a computer program.

Conventionally, research has been conducted on dialogue systems in which a person converses with a dialogue robot (see, for example, Non-Patent Document 1). Approaches to implementing a dialogue system include methods using machine learning and methods using finite state machines. A method using machine learning controls the system with learned patterns and can return some kind of response to a person's utterance, so the dialogue can continue. On the other hand, because such a method depends on the learned patterns, it may fail to carry out the dialogue the designer intended. Furthermore, even when a specific behavior needs to be corrected, the correction is not easy to make.

In contrast, with a method using finite state machines, the designer writes the rules manually, which makes it easier to realize the dialogue the designer intended. Furthermore, when a specific behavior needs to be corrected, it is sufficient to modify only the rules related to that behavior. For this reason, methods using finite state machines are also widely used.

Kazunori Komatani, "Configuration and Future of Speech Dialogue Systems", [online], [retrieved June 9, 2023], Internet <URL: https://system.jpaa.or.jp/patent/viewPdf/3307>, Vol. 72 No. 8

With a method using finite state machines, every before-and-after relationship between topic transitions must be designed. The design therefore becomes more complicated as the number of anticipated transition patterns grows. In actual operation it is difficult to design a system that covers every transition pattern, and in the undesigned parts no topic transition can be made, so the dialogue breaks down. This problem is not limited to spoken dialogue between a person and a dialogue robot; it applies equally to text-based dialogue.

In view of the above circumstances, an object of the present invention is to provide a technique that can continue a dialogue even when complex topic transition patterns cannot be fully anticipated.

One aspect of the present invention is a topic module set creation device including a topic module set creation unit that creates, based on one or more scenarios each describing a one-way dialogue flow, a topic module set including a plurality of topic modules, each indicating a topic to be provided to a user who is the dialogue partner when a predetermined condition is satisfied. At least one of the topic modules is given an activation condition that includes at least one combination of a candidate state, used to represent the state of the user or the system, and a variable. The topic module set creation unit creates a plurality of topic modules, each being a set of the candidate states and the topic, by associating content corresponding to the topic indicated in each topic module with that topic module, and creates the topic module set by arranging the created topic modules in a hierarchical structure according to a predetermined priority order.

One aspect of the present invention is a dialogue device comprising: the above topic module set creation device; a topic determination unit that determines a topic corresponding to the state of the user or the state of the system, based on the topic module set created by the topic module set creation device and state information representing at least the state of the user or the system; and an output unit that outputs content corresponding to the determined topic.

One aspect of the present invention is the above dialogue device, in which at least one of the topic modules is given an activation condition that includes at least one combination of a candidate state, used to represent the state of the user or the system, and a variable, and the topic module set creation unit creates the topic module set by associating content corresponding to the topic indicated in each topic module with that topic module and arranging the multiple topic modules in a hierarchical structure according to a predetermined priority order.

One aspect of the present invention is the above dialogue device, in which one scenario is composed of multiple topics, and which further comprises a state information creation unit that creates the state information by treating each topic constituting the scenario as a candidate state and associating a variable with each candidate state.

One aspect of the present invention is the above dialogue device, in which the state information creation unit, when creating the state information, merges multiple topics that have the same meaning, among the topics constituting the scenarios, into a single topic.
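The merging step can be pictured with a minimal sketch. It assumes that scenarios are plain lists of topic strings, that "same meaning" is decided by a hand-written alias table (`ALIASES` and the wording "Are you starving?" are hypothetical), and that each variable starts as `"U"` (the Unknown value introduced later in the embodiment). The patent does not specify how similarity is detected.

```python
from typing import Dict, List

# Hypothetical alias table: maps differently worded topics to one canonical label.
ALIASES: Dict[str, str] = {"Are you starving?": "Are you hungry?"}


def create_state_info(scenarios: List[List[str]]) -> Dict[str, str]:
    """Turn every scenario topic into a candidate state, merging duplicates."""
    state_info: Dict[str, str] = {}
    for scenario in scenarios:
        for topic in scenario:
            canonical = ALIASES.get(topic, topic)
            # setdefault keeps the first entry, so repeated topics collapse into one.
            state_info.setdefault(canonical, "U")
    return state_info


ramen = ["Are you hungry?", "Want to eat ramen?", "Can I recommend a ramen shop?"]
pasta = ["Are you tired?", "Are you starving?", "Want to eat pasta?",
         "Can I recommend a pasta restaurant?"]
state_info = create_state_info([ramen, pasta])
```

Here the two scenarios contain seven topics in total, but the hunger question appears in both (once under an alias), so only six candidate states are created.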

One aspect of the present invention is the above dialogue device, further comprising a state update unit that updates the state information on the user according to at least the content of the user's utterance or the content entered as text, and updates the state information on the system according to the content of the system's utterance or the system's action. The state update unit updates the state information by updating the variable of the candidate state that corresponds to the content of the user's utterance, the content entered as text, or the content of the system's utterance or action.
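As a rough illustration of the state update, the sketch below assumes the state information is a dictionary from candidate state to one of the variables `"Y"`, `"N"`, or `"U"` described later in the embodiment. The function name `update_state` and the boolean `answered_yes` are hypothetical; how an utterance is mapped to a candidate state and a yes/no judgment is outside this sketch.

```python
from typing import Dict


def update_state(state_info: Dict[str, str], candidate: str,
                 answered_yes: bool) -> Dict[str, str]:
    """Return a copy of state_info with one candidate state's variable updated."""
    if candidate not in state_info:
        raise KeyError(f"unknown candidate state: {candidate}")
    updated = dict(state_info)
    updated[candidate] = "Y" if answered_yes else "N"
    return updated


state = {"hungry": "U", "wants_ramen": "U"}
state = update_state(state, "hungry", answered_yes=True)
```

Returning a copy rather than mutating in place is only a design choice of this sketch; it makes each turn's state easy to keep as dialogue history.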

One aspect of the present invention is the above dialogue device, in which the topic determination unit determines, based on the state information, the topic with the highest priority among the topics associated with satisfied activation conditions as the topic corresponding to the state of the user or the system.

One aspect of the present invention is the above dialogue device, in which, when no topic module satisfies its activation condition, the topic determination unit selects a dialogue module for which no activation condition is set.
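The two selection rules above (highest-priority satisfied module, otherwise a module with no activation condition) could be sketched as follows. The module list, the state names, and the "small talk" fallback topic are all hypothetical; the sketch also assumes that a condition holds when every listed candidate state has the listed variable.

```python
from typing import Dict, List, Optional, Tuple

# (topic, activation condition); None means "no activation condition set".
Module = Tuple[str, Optional[Dict[str, str]]]


def determine_topic(modules: List[Module], state: Dict[str, str]) -> Optional[str]:
    """Pick the highest-priority module whose condition holds, else a condition-free one."""
    fallback = None
    for topic, condition in modules:  # listed from highest to lowest priority
        if condition is None:
            if fallback is None:
                fallback = topic
            continue
        if all(state.get(name) == value for name, value in condition.items()):
            return topic
    return fallback


modules: List[Module] = [
    ("recommend ramen shop", {"wants_ramen": "Y", "ramen_intro_ok": "Y"}),
    ("ask if wants ramen", {"hungry": "Y", "wants_ramen": "U"}),
    ("small talk", None),
]
```

With a hungry user whose ramen preference is still unknown, the second module is chosen; when nothing matches, the condition-free module keeps the dialogue going instead of breaking down.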

One aspect of the present invention is the above dialogue device, further comprising an operation control unit that controls the operation of a robot connected to the dialogue device, an agent displayed on a display device, or an audio output device, based on the topic determined by the topic determination unit or on information based on that topic.

One aspect of the present invention is a topic module set creation method in which a computer creates, based on one or more scenarios each describing a one-way dialogue flow, a topic module set including a plurality of topic modules, each indicating a topic to be provided to a user who is the dialogue partner when a predetermined condition is satisfied. At least one of the topic modules is given an activation condition that includes at least one combination of a candidate state, used to represent the state of the user or the system, and a variable. The computer creates a plurality of topic modules, each being a set of the candidate states and the topic, by associating content corresponding to the topic indicated in each topic module with that topic module, and creates the topic module set by arranging the created topic modules in a hierarchical structure according to a predetermined priority order.

One aspect of the present invention is a dialogue method in which a computer determines a topic corresponding to the state of the user or the system, based on the topic module set created by the above topic module set creation method and state information representing at least the state of the user or the system, and outputs content corresponding to the determined topic.

One aspect of the present invention is a computer program for causing a computer to function as the above topic module set creation device or the above dialogue device.

The present invention makes it possible to continue a dialogue even when complex topic transition patterns cannot be fully anticipated.

FIG. 1 is a diagram showing an example of the configuration of a dialogue system in the embodiment.
FIG. 2 is a diagram showing an example of state information in the embodiment.
FIG. 3 is a diagram showing an example (part 1) of a topic module set in the embodiment.
FIG. 4 is a diagram showing an example of output language information in the embodiment.
FIG. 5 is a diagram showing an example of a scenario in the embodiment.
FIGS. 6 and 7 are diagrams for explaining a method for creating state information in the embodiment.
FIGS. 8 to 19 are diagrams for explaining a method for creating a topic module set in the embodiment.
FIG. 20 is a sequence diagram (part 1) showing the processing flow of the dialogue system in the embodiment.
FIG. 21 is a diagram showing an example of updating state information in the embodiment.
FIG. 22 is a diagram showing an example (part 2) of a topic module set in the embodiment.
FIG. 23 is a sequence diagram (part 1) showing the processing flow of the dialogue system in the embodiment.
FIGS. 24 and 25 are diagrams showing examples of updating state information in the embodiment.
FIG. 26 is a diagram showing an example (part 3) of a topic module set in the embodiment.
FIG. 27 is a sequence diagram (part 3) showing the processing flow of the dialogue system in the embodiment.
FIG. 28 is a diagram showing an example of updating state information in the embodiment.
FIG. 29 is a diagram showing an example (part 4) of a topic module set in the embodiment.
FIG. 30 is a sequence diagram (part 4) showing the processing flow of the dialogue system in the embodiment.
FIG. 31 is a diagram showing an example of updating state information in the embodiment.
FIG. 32 is a diagram showing an example of the configuration of a dialogue system in a modified example.
FIG. 33 is a diagram showing an example (part 5) of a topic module set in the modified example.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
(Summary)
The dialogue device of the present invention determines a topic to be provided to the user based on two things: a topic module set including multiple topic modules, each indicating a topic to be provided to the user who is the dialogue partner when a predetermined condition is satisfied, and state information representing the user's state according to the user's actions, the content of the user's utterance, or the content entered as text. The device then outputs content based on the determined topic to the user (as audio or text). A topic provided to the user is material for conversing with the user and may concern any matter. The topic module set is composed of multiple topic modules arranged in a hierarchical structure according to their priority.

Dialogue in the present invention includes spoken dialogue between the user and the dialogue device, text dialogue between the user and the dialogue device, and dialogue in which one of the user and the dialogue device uses speech while the other uses text. The embodiment is described using spoken dialogue between the user and the dialogue device as an example.

Furthermore, the dialogue device of the present invention creates the topic module set and the state information based on one or more scenarios each describing a one-way dialogue flow. Here, a one-way dialogue flow is the flow of topics expected to be exchanged until a party reaches the goal it wants to achieve through dialogue with its dialogue partner. For example, assume that the goal is "recommend a ramen shop." In this case, the topics expected to be exchanged on the way to the goal include "Are you hungry?", "Want to eat ramen?", and "Can I recommend a ramen shop?". A possible one-way dialogue flow is to ask "Are you hungry?", then "Want to eat ramen?", then "Can I recommend a ramen shop?", and finally to recommend a ramen shop.
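A scenario of this kind can be held as an ordered list of topics plus a goal. The following is a minimal sketch under that assumption; the class name `Scenario` and the method `next_topic` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Scenario:
    goal: str
    topics: List[str]  # ordered from the first question to the one just before the goal

    def next_topic(self, current: str) -> str:
        """Return what follows `current` in the one-way flow (the goal after the last topic)."""
        i = self.topics.index(current)
        return self.topics[i + 1] if i + 1 < len(self.topics) else self.goal


ramen = Scenario(
    goal="recommend a ramen shop",
    topics=["Are you hungry?", "Want to eat ramen?", "Can I recommend a ramen shop?"],
)
```

The flow is one-way in the sense that each topic has exactly one successor; there is no branching and no way back, which is what keeps a scenario easy to write by hand.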

The dialogue device of the present invention creates the topic module set and the state information using one or more scenarios that describe a one-way dialogue flow as described above. The dialogue device then uses the created topic module set and state information to determine a topic to be provided to the user, and outputs content based on the determined topic to the user (as audio or text).

The topic module set is built on the subsumption architecture used in the field of mobile robots. It differs from a module set created with a conventional subsumption architecture in two respects: the topic module set can be created automatically from scenarios, and the user's state is taken into account. For example, in the subsumption architecture used for mobile robots, a module set is constructed by arranging modules prepared in advance according to a predetermined priority. In contrast, the dialogue system of the present invention can create the topic module set automatically from scenarios that describe a one-way dialogue flow.

Furthermore, in the subsumption architecture used in the field of mobile robots, whether an activation condition is satisfied is judged solely from the information input to each module of the module set (for example, sensor output), and information on the behavior whose condition is satisfied is output. In contrast, in the dialogue system of the present invention, whether a predetermined condition is satisfied is judged from multiple candidate states that represent the user's state. These include the information input to each topic module of the topic module set (for example, the user's state according to the content of the user's utterance, that is, the current state) and history information on the dialogue conducted with the user up to that point (for example, the user's state in past dialogue, that is, the past state). Information on the behavior whose condition is satisfied is then output. Because the dialogue system of the present invention determines the next topic to provide to the user from the flow of the past dialogue, it can continue the dialogue even when complex topic transition patterns cannot be fully anticipated.
Each embodiment will be described in detail below.

(Embodiment)
[Configuration of dialogue system 100]
FIG. 1 is a diagram showing an example of the configuration of the dialogue system 100 in the embodiment. The dialogue system 100 includes a dialogue device 10, a camera 20, a microphone 30, a speaker 40, and a display device 50. The camera 20, the microphone 30, the speaker 40, and the display device 50 are connected to the dialogue device 10 by wire or wirelessly.

The dialogue device 10 determines a topic to be provided to the user who is the dialogue partner, and realizes a dialogue with the user by outputting content corresponding to the determined topic. For example, the dialogue device 10 realizes a dialogue with the user by outputting speech whose content corresponds to the determined topic. The dialogue device 10 is configured using an information processing device such as a personal computer.

The camera 20 captures moving images of the surroundings of the dialogue device 10. The camera 20 generates a video signal corresponding to the captured moving images, and inputs image information based on the video signal to the dialogue device 10.

The microphone 30 picks up sound around the dialogue device 10. For example, the microphone 30 acquires the voice of a user who has approached the dialogue device 10. The microphone 30 generates an audio signal based on the acquired voice and outputs the generated audio signal to the dialogue device 10. Note that the microphone 30 may be provided inside the dialogue device 10.

The speaker 40 outputs the audio signal generated by the dialogue device 10. For example, the speaker 40 outputs, as audio, content based on the determined topic. The speaker 40 is provided near the display device 50 (for example, beside or behind the display device 50).

The display device 50 is an image display device such as a liquid crystal display, an organic EL (Electro Luminescence) display, or an electrophoretic display. The display device 50 displays an agent rendered in two dimensions. The two-dimensional agent is, for example, a character displayed on the screen of the display device 50. Note that the speaker 40 and the display device 50 may be integrated.

Next, the functional configuration of the dialogue device 10 will be described. The dialogue device 10 includes a storage unit 11 and a control unit 12. The storage unit 11 stores a dictionary 111, state information 112, a topic module set 113, output language information 114, operation control information 115, and the like. The storage unit 11 is configured using a storage device such as a magnetic storage device or a semiconductor storage device.

The dictionary 111 is a dictionary used for semantic analysis in natural language processing.

The state information 112 is information representing the state of the user. The user here is the user with whom the dialogue device 10 conducts a dialogue. FIG. 2 is a diagram showing an example of the state information 112. As shown in FIG. 2, the state information 112 is composed of multiple candidate states for representing the state of the user and a variable associated with each candidate state. In the example shown in FIG. 2, the candidate states are user states judged from the user's utterances, but the candidate states may also include states related to user actions detected from images obtained by the camera 20 (for example, a person approaching, a person stopping, or a person waving). A candidate state represents, for example, a state the user can be in with respect to a topic expected to be exchanged on the way to the goal to be achieved in the dialogue with the user.

For example, suppose the goals to be achieved in the dialogue with the user are "recommend a ramen shop" and "recommend a pasta restaurant." In this case, the topics expected to be exchanged on the way to the goals include "Are you tired?", "Are you hungry?", "Want to eat ramen?", "Want to eat pasta?", "Can I recommend a ramen shop?", and "Can I recommend a pasta restaurant?". Given these topics, the candidate states of the user are: being tired, being hungry, wanting to eat ramen, wanting to eat pasta, wanting a ramen shop to be introduced, wanting a pasta restaurant to be introduced, having heard a ramen shop recommendation, and having heard a pasta restaurant recommendation.

Three variables are used: Y (Yes), indicating that the user is in the associated candidate state; N (No), indicating that the user is not in the associated candidate state; and U (Unknown), indicating that it has not yet been determined whether the user is in the associated candidate state. Y is one form of the first variable, N is one form of the second variable, and U is one form of the third variable. When the dialogue device 10 starts processing, the variables of all candidate states are set to an initial value (for example, U).
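The three variables and the all-U starting point can be written down as a small enumeration. This is only a sketch: the names `Variable` and `initial_state_info` and the shortened candidate state labels are assumptions made for illustration.

```python
from enum import Enum
from typing import Dict, List


class Variable(Enum):
    Y = "Yes"      # the user is in the candidate state
    N = "No"       # the user is not in the candidate state
    U = "Unknown"  # not yet determined


def initial_state_info(candidate_states: List[str]) -> Dict[str, Variable]:
    """Build state information in which every candidate state starts as U."""
    return {name: Variable.U for name in candidate_states}


state_info = initial_state_info(["tired", "hungry", "wants ramen", "wants pasta"])
```

Keeping U separate from N matters: an unanswered question and a negative answer lead to different next topics.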

The topics and candidate states described above are examples; the designer is free to define their content. The following description uses the topics and candidate states described above as an example.

Returning to FIG. 1, the description continues. The topic module set 113 includes multiple topic modules, each indicating a topic to be provided to the user who is the dialogue partner when a predetermined condition is satisfied. FIG. 3 is a diagram showing an example (part 1) of the topic module set 113. As shown in FIG. 3, the topic module set 113 is configured, for example, by arranging topic modules 113-1 to 113-9 in a hierarchical structure according to a predetermined priority order. The priority order used for the hierarchy can be changed freely to suit the designer's purpose.

In the example shown in FIG. 3, all topic modules 113-n (n is an integer of 1 or more) are arranged in ascending order of distance to the goal (the number of topics passed through before reaching the goal). For example, suppose the goals to be achieved in the dialogue with the user are "recommend a ramen shop" and "recommend a pasta restaurant." The topics leading to the goal "recommend a ramen shop" are "Are you hungry?" ⇒ "Want to eat ramen?" ⇒ "Can I recommend a ramen shop?" ⇒ "recommend a ramen shop", and the topics leading to the goal "recommend a pasta restaurant" are "Are you tired?" ⇒ "Are you hungry?" ⇒ "Want to eat pasta?" ⇒ "Can I recommend a pasta restaurant?" ⇒ "recommend a pasta restaurant".

この場合、「疲れた？」との話題が最もゴールまでの距離が遠く（距離＝４）、「お腹空いている？」との話題が次にゴールまでの距離が遠く（距離＝３）、「ラーメン食べたい？」及び「パスタ食べたい？」との話題が次にゴールまでの距離が遠く（距離＝２）、「ラーメン店紹介ＯＫ？」及び「パスタ店紹介ＯＫ？」との話題が最もゴールまでの距離が近い（距離＝１）。そして、各話題を予め定められた優先順位（例えば、ゴールから近い距離順、かつ、パスタよりラーメン優先）で並べて、対応する起動条件を設定することで図３に示す構成となる。 In this case, the topic "Are you tired?" is the furthest from the goal (distance = 4), followed by "Are you hungry?" (distance = 3), followed by "Do you want to eat ramen?" and "Do you want to eat pasta?" (distance = 2), and then the topics "Can I introduce you to a ramen shop?" and "Can I introduce you to a pasta shop?" (distance = 1). Then, by arranging each topic in a predetermined order of priority (for example, in order of distance from the goal, and giving priority to ramen over pasta), and setting the corresponding activation conditions, the configuration shown in Figure 3 is obtained.
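The distance-based ordering described above can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the function name `goal_distances`, the choice to give goal topics a distance of 0, and the rule that a topic shared by several scenarios keeps its largest distance are all assumptions made for the example.

```python
from typing import Dict, List


def goal_distances(scenarios: List[List[str]]) -> Dict[str, int]:
    """Map each topic to the number of topics still to pass before its goal."""
    distances: Dict[str, int] = {}
    for topics in scenarios:
        last = len(topics) - 1  # index of the goal topic (assumed distance 0)
        for i, topic in enumerate(topics):
            # Assumption: a topic shared by several scenarios keeps its largest distance.
            distances[topic] = max(distances.get(topic, 0), last - i)
    return distances


ramen = ["Are you hungry?", "Want to eat ramen?",
         "Can we recommend a ramen restaurant?", "Recommend a ramen restaurant"]
pasta = ["Are you tired?", "Are you hungry?", "Want to eat pasta?",
         "Can we recommend a pasta restaurant?", "Recommend a pasta restaurant"]

# Listing the ramen scenario first makes ramen win ties, because dicts keep
# insertion order and sorted() is stable.
distances = goal_distances([ramen, pasta])
ordered = sorted(distances, key=lambda topic: distances[topic])
```

Here `ordered` lists the topics nearest the goal first, which is one possible reading of the priority order used to arrange the modules.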

起動条件は、対応付けられた話題の内容をユーザに対して出力するための条件であり、例えば状態情報１１２で示される各状態のいずれか２つ以上の組み合わせで構成される。起動条件は、上記のように話題の並べ方に応じて変更する必要があるが、ユーザの発話内容を加味した条件とすることができる。 The activation condition is a condition for outputting the content of the associated topic to the user, and is composed of, for example, a combination of two or more of the states indicated in the state information 112. The activation condition must be adjusted to match the arrangement of the topics as described above, and it can also take the content of the user's speech into account.

図１に戻って説明を続ける。出力言語情報１１４は、話題モジュールセット１１３に基づいて決定された話題に応じた対話用の文字列の情報である。図４は、出力言語情報１１４の一例を示す図である。図４に示すように、出力言語情報１１４は話題と、出力音声文字列とが対応付けられる。例えば、話題モジュールセット１１３に基づいて決定された話題が“疲れているか聞く”である場合、対話用の文字列として“疲れてない？”が選択されることが表されている。 Returning to FIG. 1 for further explanation, the output language information 114 is information on a character string for dialogue corresponding to a topic determined based on the topic module set 113. FIG. 4 is a diagram showing an example of the output language information 114. As shown in FIG. 4, the output language information 114 associates a topic with an output speech character string. For example, when the topic determined based on the topic module set 113 is "ask if you are tired", it is shown that "Aren't you tired?" is selected as the character string for dialogue.

図１に戻って説明を続ける。動作制御情報１１５は、表示装置５０に表示させるエージェントの動作を制御するための情報を含む。例えば、動作制御情報１１５は、話題又は出力音声文字列と、制御内容とが対応付けられたテーブルであってもよい。制御内容は、エージェントの動作（例えば、表情、身振り手振りなど）を制御するための内容である。 Returning to FIG. 1 for further explanation, the operation control information 115 includes information for controlling the operation of the agent to be displayed on the display device 50. For example, the operation control information 115 may be a table in which a topic or an output voice character string is associated with control content. The control content is content for controlling the operation of the agent (e.g., facial expressions, gestures, etc.).

図１に戻って説明を続ける。制御部１２は、対話装置１０全体を制御する。制御部１２は、ＣＰＵ（Central Processing Unit）等のプロセッサやメモリを用いて構成される。制御部１２は、プログラムを実行することによって、話題モジュールセット作成部１２０と、状態情報作成部１２１と、検出部１２２と、音声認識部１２３と、解析部１２４と、状態更新部１２５と、話題決定部１２６と、言語生成部１２７と、音声合成部１２８と、動作制御部１２９の機能を実現する。 Returning to FIG. 1, the explanation will continue. The control unit 12 controls the entire dialogue device 10. The control unit 12 is configured using a processor such as a CPU (Central Processing Unit) and a memory. By executing a program, the control unit 12 realizes the functions of a topic module set creation unit 120, a state information creation unit 121, a detection unit 122, a voice recognition unit 123, an analysis unit 124, a state update unit 125, a topic determination unit 126, a language generation unit 127, a voice synthesis unit 128, and an operation control unit 129.

話題モジュールセット作成部１２０、状態情報作成部１２１、検出部１２２、音声認識部１２３、解析部１２４、状態更新部１２５、話題決定部１２６、言語生成部１２７、音声合成部１２８及び動作制御部１２９のうち一部または全部は、ＡＳＩＣ（Application Specific Integrated Circuit）やＰＬＤ（Programmable Logic Device）、ＦＰＧＡ（Field Programmable Gate Array）などのハードウェア（回路部；circuitryを含む）によって実現されてもよいし、ソフトウェアとハードウェアとの協働によって実現されてもよい。プログラムは、コンピュータ読み取り可能な記録媒体に記録されてもよい。コンピュータ読み取り可能な記録媒体とは、例えばフレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ－ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置などの非一時的な記憶媒体である。プログラムは、電気通信回線を介して送信されてもよい。 Part or all of the topic module set creation unit 120, state information creation unit 121, detection unit 122, voice recognition unit 123, analysis unit 124, state update unit 125, topic determination unit 126, language generation unit 127, voice synthesis unit 128, and operation control unit 129 may be realized by hardware (including circuitry) such as an ASIC (Application Specific Integrated Circuit), a PLD (Programmable Logic Device), or an FPGA (Field Programmable Gate Array), or may be realized by a combination of software and hardware. The program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a non-transitory storage medium such as a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or a storage device such as a hard disk built into a computer system. The program may be transmitted via a telecommunications line.

話題モジュールセット作成部１２０、状態情報作成部１２１、検出部１２２、音声認識部１２３、解析部１２４、状態更新部１２５、話題決定部１２６、言語生成部１２７、音声合成部１２８及び動作制御部１２９の機能の一部は、予め対話装置１０に搭載されている必要はなく、追加のアプリケーションプログラムが対話装置１０にインストールされることで実現されてもよい。 Some of the functions of the topic module set creation unit 120, state information creation unit 121, detection unit 122, voice recognition unit 123, analysis unit 124, state update unit 125, topic determination unit 126, language generation unit 127, voice synthesis unit 128 and operation control unit 129 do not need to be pre-installed in the dialogue device 10, and may be realized by installing additional application programs in the dialogue device 10.

話題モジュールセット作成部１２０は、一方向の対話の流れが記述された１以上のシナリオに基づいて話題モジュールセットを作成する。話題モジュールセット作成部１２０は、作成した話題モジュールセットを記憶部１１に記憶する。話題モジュールセット作成部１２０によって作成された話題モジュールセットが、記憶部１１に記憶されている話題モジュールセット１１３である。シナリオは、予めユーザによって作成される。シナリオは、外部の装置で作成されて対話装置１０に入力されてもよいし、ユーザが対話装置１０を操作して作成してもよい。 The topic module set creation unit 120 creates a topic module set based on one or more scenarios that describe a one-way dialogue flow. The topic module set creation unit 120 stores the created topic module set in the storage unit 11. The topic module set created by the topic module set creation unit 120 is the topic module set 113 stored in the storage unit 11. The scenario is created in advance by the user. The scenario may be created on an external device and input to the dialogue device 10, or may be created by the user operating the dialogue device 10.

状態情報作成部１２１は、シナリオを構成する各話題を候補状態とし、各候補状態に変数を対応付けることによって状態情報を作成する。状態情報作成部１２１は、作成した状態情報を記憶部１１に記憶する。状態情報作成部１２１によって作成された状態情報が、記憶部１１に記憶されている状態情報１１２である。 The state information creation unit 121 creates state information by treating each topic constituting the scenario as a candidate state and associating variables with each candidate state. The state information creation unit 121 stores the created state information in the storage unit 11. The state information created by the state information creation unit 121 is the state information 112 stored in the storage unit 11.

検出部１２２は、カメラ２０によって撮像された動画像に基づいて人物の行動を検知する。人物の行動としては、例えば対話装置１０に人物が近づいてくる、人物が立ち止まる、人物が手を振る等の人物が行う動作や振る舞いである。なお、人物の行動を検知する方法は、これに限られず、人物を検知できる方法であればどのような方法であってもよい。例えば、検出部１２２は、不図示のセンサにより検出された情報に基づいて人物の行動を検知してもよい。例えば、検出部１２２は、他の装置からユーザの行動に関する情報が入力されたことを契機に、人物の行動を検知してもよい。 The detection unit 122 detects the behavior of a person based on video images captured by the camera 20. Examples of the behavior of a person include the movement or behavior of a person approaching the dialogue device 10, stopping, waving, etc. Note that the method of detecting the behavior of a person is not limited to this, and any method capable of detecting a person may be used. For example, the detection unit 122 may detect the behavior of a person based on information detected by a sensor (not shown). For example, the detection unit 122 may detect the behavior of a person in response to information regarding the user's behavior being input from another device.

音声認識部１２３は、音声認識処理を実行する。音声認識処理は、音声信号に基づいて文字列を生成する処理である。音声認識部１２３は、音声認識処理を実行することで、マイク３０から出力された音声信号に基づいて文字列を生成する。音声認識部１２３は、公知の手法を用いて文字列を生成してもよい。 The voice recognition unit 123 executes a voice recognition process. The voice recognition process is a process for generating a character string based on a voice signal. The voice recognition unit 123 executes the voice recognition process to generate a character string based on a voice signal output from the microphone 30. The voice recognition unit 123 may generate a character string using a known method.

解析部１２４は、音声認識部１２３により生成された文字列と、記憶部１１に記憶されている辞書１１１とを用いて自然言語処理を行うことでユーザの発話内容を解析する。 The analysis unit 124 analyzes the content of the user's speech by performing natural language processing using the character string generated by the voice recognition unit 123 and the dictionary 111 stored in the storage unit 11.

状態更新部１２５は、解析部１２４により解析された発話内容（ユーザの発話内容）又は検出部１２２により検知された人物の行動に基づいて、状態情報１１２を更新する。具体的には、状態更新部１２５は、解析部１２４により解析された発話内容に基づいて、発話内容に応じた候補状態の変数を更新する。さらに、状態更新部１２５は、検出部１２２により検知された人物の行動に基づいて、発話内容に応じた候補状態の変数を更新する。例えば、検出部１２２により検知された人物の行動が「人物が手を振っている」動作である場合、状態更新部１２５は状態情報１１２における「人物が手を振っている動作」に対応付けられている変数を更新する。状態更新部１２５は、状態情報１１２を更新したことを話題決定部１２６に通知する。 The state update unit 125 updates the state information 112 based on the speech content (user's speech content) analyzed by the analysis unit 124 or the behavior of the person detected by the detection unit 122. Specifically, the state update unit 125 updates the variables of the candidate states according to the speech content based on the speech content analyzed by the analysis unit 124. Furthermore, the state update unit 125 updates the variables of the candidate states according to the speech content based on the behavior of the person detected by the detection unit 122. For example, if the behavior of the person detected by the detection unit 122 is the behavior of "a person waving his/her hand", the state update unit 125 updates the variables associated with "the behavior of a person waving his/her hand" in the state information 112. The state update unit 125 notifies the topic determination unit 126 that the state information 112 has been updated.

話題決定部１２６は、状態更新部１２５により状態情報１１２が更新されたことに応じて、状態情報１１２と話題モジュールセット１１３とに基づいて、ユーザの状態に応じた話題を決定する。具体的には、話題決定部１２６は、状態情報１１２で示されるユーザの状態に基づいて話題モジュールセット１１３を構成する話題モジュール１１３－ｎにおけるいずれかの起動条件を満たすか否かを判定し、条件が満たされた起動条件に対応付けられた話題のうち、優先順位の高い話題をユーザの状態に応じた話題として決定する。 In response to the state information 112 being updated by the state update unit 125, the topic determination unit 126 determines a topic according to the user's state based on the state information 112 and the topic module set 113. Specifically, the topic determination unit 126 determines whether or not any of the activation conditions in the topic modules 113-n constituting the topic module set 113 is satisfied based on the user's state indicated by the state information 112, and determines the topic with the highest priority among the topics associated with the satisfied activation conditions as the topic according to the user's state.
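The selection rule described above can be sketched as follows: scan the modules in priority order and return the first topic whose activation condition holds. This is only an illustrative sketch. The names `TopicModule` and `determine_topic`, the state keys `hungry` and `want_ramen`, and the value strings `"Unknown"`, `"Yes"`, and `"No"` are assumptions made for the example.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

State = Dict[str, str]  # candidate state -> "Unknown", "Yes", or "No" (assumed encoding)


class TopicModule(NamedTuple):
    topic: str
    condition: Callable[[State], bool]  # activation condition


def determine_topic(modules: List[TopicModule], state: State) -> Optional[str]:
    """Return the highest-priority topic whose activation condition is satisfied."""
    for module in modules:  # list order encodes priority, highest first
        if module.condition(state):
            return module.topic
    return None  # no activation condition is satisfied


modules = [
    TopicModule("Want to eat ramen?",
                lambda s: s["hungry"] == "Yes" and s["want_ramen"] == "Unknown"),
    TopicModule("Are you hungry?",
                lambda s: s["hungry"] == "Unknown"),
]

first = determine_topic(modules, {"hungry": "Unknown", "want_ramen": "Unknown"})
second = determine_topic(modules, {"hungry": "Yes", "want_ramen": "Unknown"})
nothing = determine_topic(modules, {"hungry": "No", "want_ramen": "Unknown"})
```

In this sketch the function would be called each time the state information is updated, mirroring the notification from the state update unit 125.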

言語生成部１２７は、話題決定部１２６により決定された話題と、出力言語情報１１４とに基づいて、音声出力させる文字列を生成する。 The language generation unit 127 generates a string of characters to be output as speech based on the topic determined by the topic determination unit 126 and the output language information 114.

音声合成部１２８は、言語生成部１２７により生成された文字列に対応する音声信号を生成する。音声合成部１２８により生成された音声信号は、スピーカー４０から出力される。すなわち、音声合成部１２８は、決定された話題による内容をスピーカー４０から出力させる。音声合成部１２８は、出力部の一態様である。 The voice synthesis unit 128 generates a voice signal corresponding to the character string generated by the language generation unit 127. The voice signal generated by the voice synthesis unit 128 is output from the speaker 40. In other words, the voice synthesis unit 128 causes the content according to the determined topic to be output from the speaker 40. The voice synthesis unit 128 is one aspect of the output unit.

動作制御部１２９は、話題決定部１２６により決定された話題又は話題に基づく情報と、動作制御情報１１５とに基づいて、表示装置５０に表示させるエージェントの動作を制御する。具体的には、動作制御部１２９は、動作制御情報１１５を参照し、話題決定部１２６により決定された話題に対応付けられた制御内容を取得する。動作制御部１２９は、取得した制御内容に応じた動作を行うエージェントの映像情報を生成して、生成した映像情報を表示装置５０に表示させることによって、エージェントの動作を制御する。 The operation control unit 129 controls the operation of the agent to be displayed on the display device 50 based on the topic or information based on the topic determined by the topic determination unit 126 and the operation control information 115. Specifically, the operation control unit 129 refers to the operation control information 115 and acquires the control content associated with the topic determined by the topic determination unit 126. The operation control unit 129 controls the operation of the agent by generating video information of the agent performing an operation according to the acquired control content and displaying the generated video information on the display device 50.

なお、動作制御情報１１５として、出力音声文字列と、制御内容とが対応付けられたテーブルが用いられる場合、動作制御部１２９は、出力言語情報１１４を参照し、話題決定部１２６により決定された話題に対応付けられた出力音声文字列を取得する。動作制御部１２９は、動作制御情報１１５を参照し、取得した出力音声文字列に対応付けられた制御内容を取得する。動作制御部１２９は、取得した制御内容に応じた動作を行うエージェントの映像情報を生成して、生成した映像情報を表示装置５０に表示させることによって、エージェントの動作を制御する。 When a table in which output voice character strings and control contents are associated is used as the operation control information 115, the operation control unit 129 refers to the output language information 114 and acquires the output voice character string associated with the topic determined by the topic determination unit 126. The operation control unit 129 refers to the operation control information 115 and acquires the control contents associated with the acquired output voice character string. The operation control unit 129 generates video information of an agent performing an operation according to the acquired control contents, and controls the operation of the agent by displaying the generated video information on the display device 50.

［話題モジュールセット及び状態情報の作成］
次に話題モジュールセット及び状態情報を作成する方法について具体的に説明する。話題モジュールセット及び状態情報を作成するために、話題モジュールセット作成部１２０が用いるシナリオを図５に示す。図５は、実施形態におけるシナリオの一例を示す図である。図５には、３つのシナリオＳＣ１，ＳＣ２，ＳＣ３を示している。シナリオＳＣ１は、ユーザとの対話で達成したいゴールが「ラーメン店を推薦する」ことを想定したシナリオである。そこで、シナリオＳＣ１において、ゴールに向かうためにやり取りされることが想定される一方向の対話の流れの一例として、「お腹空いている？」⇒「ラーメン食べたい？」⇒「ラーメン店紹介ＯＫ？」⇒「ラーメン店を推薦する」といった順番で話題が設定されている。このように１つのシナリオＳＣは、複数の話題で構成されている。 [Creating topic module sets and state information]
Next, a method for creating a topic module set and state information will be specifically described. A scenario used by the topic module set creation unit 120 to create a topic module set and state information is shown in FIG. 5. FIG. 5 is a diagram showing an example of a scenario in the embodiment. FIG. 5 shows three scenarios SC1, SC2, and SC3. Scenario SC1 is a scenario that assumes that the goal to be achieved in a dialogue with a user is "recommending a ramen shop". In scenario SC1, the topics are set in the following order as an example of a one-way dialogue flow that is assumed to be exchanged toward the goal: "Are you hungry?"->"Want to eat ramen?"->"Can I recommend a ramen shop?"->"Recommend a ramen shop". In this way, one scenario SC is composed of multiple topics.

シナリオＳＣ２は、ユーザとの対話で達成したいゴールが「パスタ店を推薦する」ことを想定したシナリオである。そこで、シナリオＳＣ２において、ゴールに向かうためにやり取りされることが想定される一方向の対話の流れの一例として、「疲れた？」⇒「お腹空いている？」⇒「パスタ食べたい？」⇒「パスタ店紹介ＯＫ？」⇒「パスタ店を推薦する」といった順番で話題が設定されている。 Scenario SC2 is a scenario that assumes that the goal to be achieved in a dialogue with a user is to "recommend a pasta restaurant." In scenario SC2, therefore, as an example of a one-way dialogue flow that is expected to take place toward the goal, the topics are set in the following order: "Are you tired?" ⇒ "Are you hungry?" ⇒ "Want to eat pasta?" ⇒ "Can I recommend a pasta restaurant?" ⇒ "Recommend a pasta restaurant."

シナリオＳＣ３は、ユーザとの対話で達成したいゴールが「マッサージ店を推薦する」ことを想定したシナリオである。そこで、シナリオＳＣ３において、ゴールに向かうためにやり取りされることが想定される一方向の対話の流れの一例として、「疲れてない？」⇒「癒す方法を知りたい？」⇒「マッサージ店紹介ＯＫ？」⇒「マッサージ店を推薦する」といった順番で話題が設定されている。 Scenario SC3 is a scenario that assumes that the goal to be achieved in a dialogue with a user is to "recommend a massage parlor." In scenario SC3, therefore, as an example of a one-way dialogue flow that is expected to take place toward the goal, the topics are set in the following order: "Are you not tired?" ⇒ "Would you like to know how to relax?" ⇒ "Can I recommend a massage parlor?" ⇒ "Recommend a massage parlor."

図５に示すように、各シナリオＳＣには、話題を変えるといったような分岐はない。話題を変えるといったような分岐をさせたい場合には、途中から別の流れになるシナリオＳＣを新たに作成すればよい。図５に示す各シナリオＳＣは、１つの話題に応じた内容に対して対話の相手であるユーザが肯定的な意見を応答する場合のみを想定して作成している。シナリオＳＣ１では、例えば、「お腹空いている？」という話題に応じた内容（例えば、「お腹空いている？」）に対して、対話の相手であるユーザが「はい」や「お腹空いている」といったような肯定的な意見を応答することを想定し、「お腹空いている？」という話題の次に「ラーメン食べたい？」といった話題を設定している。 As shown in FIG. 5, no scenario SC contains a branch such as a change of topic. To introduce such a branch, a new scenario SC whose flow diverges partway through can be created. Each scenario SC shown in FIG. 5 assumes only the case where the user, as the conversation partner, responds affirmatively to the content for each topic. In scenario SC1, for example, the user is assumed to respond affirmatively, such as "Yes" or "I'm hungry", to the content for the topic "Are you hungry?" (for example, the utterance "Are you hungry?"), and so the topic "Want to eat ramen?" is set after the topic "Are you hungry?".

なお、否定的な意見（例えば、「いいえ」や「違う」等）を応答する場合を想定してシナリオＳＣを作成することもできるが、簡潔で明快な説明のため以下で説明するシナリオＳＣとしては、対話の相手であるユーザが肯定的な意見を応答する場合のみを想定したシナリオＳＣを例に説明する。図５に示した各シナリオＳＣは一例であり、シナリオＳＣ内の話題は適宜変更されてもよい。 It is also possible to create a scenario SC assuming a case where a negative opinion (for example, "No" or "That's not right") is responded to, but for the sake of concise and clear explanation, the scenario SC described below assumes only a case where the user who is the other party in the conversation responds with a positive opinion. Each scenario SC shown in FIG. 5 is an example, and the topic within the scenario SC may be changed as appropriate.

まず状態情報の作成方法について図６及び図７を用いて説明する。状態情報作成部１２１は、上述したように作成された１以上のシナリオＳＣを入力とする。状態情報作成部１２１は、入力したシナリオＳＣに基づいて、シナリオＳＣを構成する各話題を候補状態とし、各候補状態に変数を対応付けることによって状態情報を作成する。具体的には、図６に示すように、まず状態情報作成部１２１は、入力した各シナリオＳＣを話題毎に分割する。次に、状態情報作成部１２１は、分割した各話題を候補状態として、各候補状態に対して変数（例えば、初期値として「Ｕ」）を対応付ける。変数Ｕは、上述したように、ユーザの状態が、対応付けられている候補状態であるか否かが特定されていないことを表す変数である。次に、状態情報作成部１２１は、図７に示すように、複数の候補状態の中で同じ意味になる候補状態を検索する。 First, the method of creating state information will be described with reference to FIG. 6 and FIG. 7. The state information creation unit 121 receives one or more scenarios SC created as described above. Based on the input scenario SC, the state information creation unit 121 creates state information by setting each topic constituting the scenario SC as a candidate state and associating a variable with each candidate state. Specifically, as shown in FIG. 6, the state information creation unit 121 first divides each input scenario SC into topics. Next, the state information creation unit 121 sets each divided topic as a candidate state and associates a variable (for example, "U" as an initial value) with each candidate state. As described above, the variable U is a variable that indicates that it is not specified whether the user's state is the associated candidate state or not. Next, the state information creation unit 121 searches for a candidate state that has the same meaning among multiple candidate states, as shown in FIG. 7.

状態情報作成部１２１は、複数の候補状態の中で同じ意味になる候補状態を検索する方法は、特に限定されない。例えば、事前学習済み言語モデルが用いられてもよいし、２変数が同じ意味であることを示す教師ラベルを学習させた学習済みモデルが用いられてもよい。事前学習済み言語モデルを用いる場合、状態情報作成部１２１は、ベクトルの類似度によって、複数の候補状態の中で同じ意味になる候補状態を検索する。言語モデルは、Ｔｒａｎｓｆｏｒｍｅｒベースのモデルであってもよい。状態情報作成部１２１は、上述したいずれかの方法によって同じ意味になる候補状態を検索する。図７に示す例では、状態情報作成部１２１は、「お腹空いている？」という２つの候補状態が同じ意味の候補状態であると検索され、「疲れた？」と「疲れてない？」という２つの候補状態が同じ意味の候補状態であると検索される。 The method by which the state information creation unit 121 searches for candidate states that have the same meaning is not particularly limited. For example, a pre-trained language model may be used, or a model trained with teacher labels indicating that two variables have the same meaning may be used. When a pre-trained language model is used, the state information creation unit 121 searches for candidate states with the same meaning based on vector similarity. The language model may be a Transformer-based model. The state information creation unit 121 searches for candidate states with the same meaning by any of the above methods. In the example shown in FIG. 7, the state information creation unit 121 finds that the two "Are you hungry?" candidate states have the same meaning, and that the candidate states "Are you tired?" and "Are you not tired?" have the same meaning.

状態情報作成部１２１は、検索結果として得られた同じ意味の複数の候補状態のうち、１つの候補状態を選択し、残りの候補状態を削除する。図７に示す例では、状態情報作成部１２１は、例えば「お腹空いている？」という２つの候補状態のうち、「お腹空いている？」を示す１つの候補状態を選択し、残りの「お腹空いている？」を示す候補状態を削除する。さらに、図７に示す例では、状態情報作成部１２１は、例えば「疲れた？」と「疲れてない？」という２つの候補状態のうち、「疲れた？」を示す１つの候補状態を選択し、残りの「疲れてない？」を示す候補状態を削除する。このようにして、状態情報作成部１２１は、同じ意味の候補状態が複数存在しないように調整を行う。そして、状態情報作成部１２１は、上述した処理の結果として得られた候補状態と変数の組み合わせをまとめることによって図２に示す状態情報１１２を作成する。 The state information creation unit 121 selects one candidate state from among multiple candidate states with the same meaning obtained as a search result, and deletes the remaining candidate states. In the example shown in FIG. 7, the state information creation unit 121 selects one candidate state indicating "Are you hungry?" from two candidate states, for example, "Are you hungry?", and deletes the remaining candidate state indicating "Are you hungry?". Furthermore, in the example shown in FIG. 7, the state information creation unit 121 selects one candidate state indicating "Are you tired?" from two candidate states, for example, "Are you tired?" and "Are you not tired?", and deletes the remaining candidate state indicating "Are you not tired?". In this way, the state information creation unit 121 makes adjustments so that multiple candidate states with the same meaning do not exist. Then, the state information creation unit 121 creates the state information 112 shown in FIG. 2 by combining the combinations of candidate states and variables obtained as a result of the above-mentioned processing.
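The state information creation steps above (split each scenario into topics, attach the initial variable "U", then keep one candidate state per meaning) can be sketched as follows. This is a minimal sketch under stated assumptions: `create_state_info` is a hypothetical name, and the `same_meaning` predicate is a hand-written alias table standing in for the language-model similarity search, which is not reproduced here.

```python
from typing import Callable, Dict, List

SC1 = ["Are you hungry?", "Want to eat ramen?",
       "Can I recommend a ramen shop?", "Recommend a ramen shop"]
SC2 = ["Are you tired?", "Are you hungry?", "Want to eat pasta?",
       "Can I recommend a pasta restaurant?", "Recommend a pasta restaurant"]
SC3 = ["Are you not tired?", "Would you like to know how to relax?",
       "Can I recommend a massage parlor?", "Recommend a massage parlor"]

# Assumption: a fixed alias table replaces the model-based same-meaning search.
ALIASES = {"Are you not tired?": "Are you tired?"}


def same_meaning(a: str, b: str) -> bool:
    return ALIASES.get(a, a) == ALIASES.get(b, b)


def create_state_info(scenarios: List[List[str]],
                      is_same: Callable[[str, str], bool]) -> Dict[str, str]:
    """Build candidate states with initial variable "U", one per distinct meaning."""
    state_info: Dict[str, str] = {}
    for topics in scenarios:      # split each scenario into its topics
        for topic in topics:
            # Keep the first candidate state seen for a meaning, drop later ones.
            if not any(is_same(topic, kept) for kept in state_info):
                state_info[topic] = "U"
    return state_info


state_info = create_state_info([SC1, SC2, SC3], same_meaning)
```

With these three scenarios the thirteen topics collapse to eleven candidate states, because "Are you hungry?" appears twice and "Are you not tired?" is merged into "Are you tired?".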

次に話題モジュールセットの作成方法について図８～図１９を用いて説明する。話題モジュールセット作成部１２０は、上述したように作成された１以上のシナリオＳＣを入力とする。話題モジュールセット作成部１２０は、入力した１以上のシナリオＳＣを構成する各話題を話題モジュールとして作成する。この際、話題モジュールセット作成部１２０は、同じ意味を示す話題については用語を統一して話題モジュールとして作成してもよい。例えば、シナリオＳＣ２を構成する話題の１つである「疲れた？」と、シナリオＳＣ３を構成する話題の１つである「疲れてない？」とは同じ意味を示す話題である。そこで、話題モジュールセット作成部１２０は、「疲れた？」又は「疲れてない？」のいずれかの用語に統一して話題モジュールとして作成する。図８では、ＳＣ３を構成する話題の１つである「疲れてない？」を「疲れた？」に変更して話題モジュールとして作成した場合を示している。 Next, a method for creating a topic module set will be described with reference to Figs. 8 to 19. The topic module set creation unit 120 receives one or more scenarios SC created as described above. The topic module set creation unit 120 creates each topic constituting the one or more input scenarios SC as a topic module. In doing so, the topic module set creation unit 120 may unify the wording of topics that have the same meaning before creating them as topic modules. For example, "Are you tired?", which is one of the topics constituting scenario SC2, and "Are you not tired?", which is one of the topics constituting scenario SC3, have the same meaning. Therefore, the topic module set creation unit 120 unifies the wording to either "Are you tired?" or "Are you not tired?" and creates the topic module. Fig. 8 shows a case where "Are you not tired?", which is one of the topics constituting SC3, is changed to "Are you tired?" and created as a topic module.

次に、話題モジュールセット作成部１２０は、話題のつながりに基づいて、各話題モジュールに対して起動条件を設定する。話題モジュールセット作成部１２０は、起動条件を設定するための手順として、３つの手順（起動条件の設定１～設定３）を行う。まず話題モジュールセット作成部１２０は、起動条件の設定１として、図９及び図１０に示すように、各話題モジュールに対して、起動条件（ＩＦ：話題に対応する変数＝＝Ｕｎｋｎｏｗｎ）を対応付ける。起動条件の設定１の目的は、ユーザが一度答えた話題を繰り返さないことである。 Next, the topic module set creation unit 120 sets an activation condition for each topic module based on the connections between topics. To set the activation conditions, the topic module set creation unit 120 performs three procedures (activation condition settings 1 to 3). First, as activation condition setting 1, the topic module set creation unit 120 associates an activation condition (IF: variable corresponding to the topic == Unknown) with each topic module, as shown in Figures 9 and 10. The purpose of activation condition setting 1 is to avoid repeating a topic that the user has already answered.

例えば、「お腹空いている？」と聞いた後に、再度「お腹空いている？」と聞かないようにするために、「お腹空いている？」という話題に対応する内容を出力するための起動条件として、話題モジュールセット作成部１２０は、「お腹空いている？」という話題に対してＩＦ：お腹空いている＝＝Ｕｎｋｎｏｗｎを設定する。これは、ユーザの状態が、お腹空いている状態か否かが特定されていない場合にのみ起動することを意味する。例えば、お腹空いているか否かを一度ユーザに問い合わせた場合、ユーザの回答に応じてユーザの状態が変化（候補状態に対応付けられた変数が変化）するため、ＩＦ：お腹空いている＝＝Ｕｎｋｎｏｗｎという条件を満たさなくなる。その結果、「お腹空いている？」という話題を選択しなくなる。これにより、起動条件の設定１の目的であるユーザが一度答えた話題を繰り返さないことを満たすことができる。ここでは、「お腹空いている？」という話題を例に説明したが、図９及び図１０に示すように他の話題モジュールに対しても同様に、ユーザが一度答えた話題を繰り返さないようにするための起動条件が設定される。 For example, to avoid asking "Are you hungry?" again after it has already been asked, the topic module set creation unit 120 sets IF: hungry == Unknown for the topic "Are you hungry?" as the activation condition for outputting content corresponding to that topic. This means that the module is activated only while it has not been determined whether the user is hungry. For example, once the user has been asked whether he or she is hungry, the user's state changes according to the answer (the variable associated with the candidate state changes), so the condition IF: hungry == Unknown is no longer satisfied. As a result, the topic "Are you hungry?" is no longer selected. This achieves the purpose of activation condition setting 1, which is to avoid repeating a topic that the user has already answered. The topic "Are you hungry?" has been used as an example here, but as shown in Figures 9 and 10, activation conditions that prevent an already answered topic from being repeated are set for the other topic modules in the same way.
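Activation condition setting 1 can be sketched as follows. This is an illustrative sketch only: conditions are represented as dictionaries of required variable values, each topic's own text is reused as its variable name for brevity, and the names `apply_setting1` and `is_active` are hypothetical.

```python
from typing import Dict, List

Condition = Dict[str, str]  # variable -> required value
State = Dict[str, str]      # variable -> current value


def apply_setting1(scenarios: List[List[str]]) -> Dict[str, Condition]:
    """Setting 1: each topic module fires only while its own variable is Unknown."""
    conditions: Dict[str, Condition] = {}
    for topics in scenarios:
        for topic in topics:
            # Simplification: the topic text doubles as its variable name.
            conditions.setdefault(topic, {})[topic] = "Unknown"
    return conditions


def is_active(condition: Condition, state: State) -> bool:
    # Variables absent from the state are treated as Unknown (assumption).
    return all(state.get(var, "Unknown") == want for var, want in condition.items())


SC1 = ["Are you hungry?", "Want to eat ramen?",
       "Can I recommend a ramen shop?", "Recommend a ramen shop"]
conditions = apply_setting1([SC1])

before = is_active(conditions["Are you hungry?"], {})
after = is_active(conditions["Are you hungry?"], {"Are you hungry?": "Yes"})
```

Once the user has answered, the variable is no longer Unknown, so `after` is false and the question is not asked again.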

次に話題モジュールセット作成部１２０は、起動条件の設定２として、図１１～図１３に示すように、各シナリオＳＣを構成する各話題を基準として、達成したいゴールへ向かう話題に関する起動条件（ＩＦ：各話題の未来の話題＝＝Ｕｎｋｎｏｗｎ）を追加する。例えば、シナリオＳＣ１のように「お腹空いている？」⇒「ラーメン食べたい？」⇒「ラーメン店紹介ＯＫ？」⇒「ラーメン店を推薦する」といった順番で話題が設定されている場合、「お腹空いている？」を基準として、達成したいゴール（「ラーメン店を推薦する」）へ向かうために想定される話題は、「ラーメン食べたい？」と、「ラーメン店紹介ＯＫ？」と、「ラーメン店を推薦する」である。そこで、話題モジュールセット作成部１２０は、「お腹空いている？」の話題に対応付けられている起動条件に対して、ラーメン食べたい＝＝Ｕｎｋｎｏｗｎと、ラーメン店紹介ＯＫ＝＝Ｕｎｋｎｏｗｎと、ラーメン店推薦聞いた＝＝Ｕｎｋｎｏｗｎという条件を追加で設定する。 Next, as activation condition setting 2, the topic module set creation unit 120 adds, for each topic constituting each scenario SC, activation conditions concerning the topics that lead toward the goal to be achieved (IF: future topic of each topic == Unknown), as shown in Figs. 11 to 13. For example, when the topics are set in the order "Are you hungry?" ⇒ "Want to eat ramen?" ⇒ "Can I recommend a ramen shop?" ⇒ "Recommend a ramen shop", as in scenario SC1, the topics expected on the way from "Are you hungry?" to the goal ("Recommend a ramen shop") are "Want to eat ramen?", "Can I recommend a ramen shop?", and "Recommend a ramen shop". Therefore, the topic module set creation unit 120 adds the conditions "want to eat ramen == Unknown", "ramen shop introduction OK == Unknown", and "heard ramen shop recommendation == Unknown" to the activation condition associated with the topic "Are you hungry?".

同様に、「ラーメン食べたい？」を基準として、達成したいゴール（「ラーメン店を推薦する」）へ向かうために想定される話題は、「ラーメン店紹介ＯＫ？」と、「ラーメン店を推薦する」である。そこで、話題モジュールセット作成部１２０は、「ラーメン食べたい？」の話題に対応付けられている起動条件に対して、ラーメン店紹介ＯＫ＝＝Ｕｎｋｎｏｗｎと、ラーメン店推薦聞いた＝＝Ｕｎｋｎｏｗｎという条件を追加で設定する。同様に、「ラーメン店紹介ＯＫ？」を基準として、達成したいゴール（「ラーメン店を推薦する」）へ向かうために想定される話題は、「ラーメン店を推薦する」である。そこで、話題モジュールセット作成部１２０は、「ラーメン店紹介ＯＫ？」の話題に対応付けられている起動条件に対して、ラーメン店推薦聞いた＝＝Ｕｎｋｎｏｗｎという条件を追加で設定する。 Similarly, the topics expected on the way from "Want to eat ramen?" to the goal ("Recommend a ramen shop") are "Can I recommend a ramen shop?" and "Recommend a ramen shop". Therefore, the topic module set creation unit 120 adds the conditions "ramen shop introduction OK == Unknown" and "heard ramen shop recommendation == Unknown" to the activation condition associated with the topic "Want to eat ramen?". Likewise, the only topic expected on the way from "Can I recommend a ramen shop?" to the goal ("Recommend a ramen shop") is "Recommend a ramen shop". Therefore, the topic module set creation unit 120 adds the condition "heard ramen shop recommendation == Unknown" to the activation condition associated with the topic "Can I recommend a ramen shop?".

以上の説明はシナリオＳＣ１に関する内容であるが、話題モジュールセット作成部１２０は同様の処理を他のシナリオＳＣ（例えば、シナリオＳＣ２及びシナリオＳＣ３）に対しても行う。これにより、図１１～図１３に示すように起動条件が追加で設定される。設定２の目的は、到達したい話題が到達済みとなっている話題モジュールを選択しないこと、である。例えば、「お腹空いている？」という話題は、「お腹空いている＝＝ＹＥＳｏｒＮＯ」に関連する次の話題をシステムが有しているとユーザは想起できると考えられる。そのため、「お腹空いている？」に返答をした際に、「お腹空いている」に関連しない次の話題が続くと、ユーザは「お腹空いている？」という話題が何のために行われたか不思議に感じられると思われる。そのような事態を防ぐために、設定２の起動条件を付け、到達したい話題がなくなっている（到達済みになっている）話題モジュールを判別する。 The above explanation concerns scenario SC1, but the topic module set creation unit 120 performs the same processing for the other scenarios SC (for example, scenario SC2 and scenario SC3). As a result, additional activation conditions are set as shown in Figures 11 to 13. The purpose of setting 2 is to avoid selecting a topic module whose target topics have already been reached. For example, the topic "Are you hungry?" is likely to lead the user to expect that the system has a follow-up topic related to "hungry == YES or NO". If, after the user answers "Are you hungry?", the next topic is unrelated to being hungry, the user is likely to wonder why the question was asked. To prevent this, the activation condition of setting 2 is added so that topic modules whose target topics no longer remain (have already been reached) can be identified.
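Activation condition settings 1 and 2 for a single scenario can be sketched as follows. This sketch makes several assumptions: it handles one scenario at a time (how conditions for a topic shared by several scenarios are combined is not covered here), each topic's text doubles as its variable name, and `apply_settings_1_and_2` is a hypothetical name.

```python
from typing import Dict, List

Condition = Dict[str, str]  # variable -> required value


def apply_settings_1_and_2(scenario: List[str]) -> Dict[str, Condition]:
    """Require the topic itself and every later topic in the scenario to be Unknown."""
    conditions: Dict[str, Condition] = {}
    for i, topic in enumerate(scenario):
        condition: Condition = {topic: "Unknown"}  # setting 1: not yet answered
        for future_topic in scenario[i + 1:]:
            condition[future_topic] = "Unknown"    # setting 2: goal path still open
        conditions[topic] = condition
    return conditions


SC1 = ["Are you hungry?", "Want to eat ramen?",
       "Can I recommend a ramen shop?", "Recommend a ramen shop"]
conditions = apply_settings_1_and_2(SC1)

# If the recommendation has already been given, the opening question is pointless.
state = {topic: "Unknown" for topic in SC1}
state["Recommend a ramen shop"] = "Yes"
hungry_still_active = all(state[var] == want
                          for var, want in conditions["Are you hungry?"].items())
```

In this sketch the opening topic carries four conditions and the goal topic only one, so a module drops out as soon as anything further along its path has been reached.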

次に話題モジュールセット作成部１２０は、起動条件の設定３として、図１４～図１６に示すように、各シナリオＳＣを構成する各話題のうち２個目以降の話題に対応付けられている起動条件で定義されている条件を追加で設定する。起動条件の設定３の目的は、起動条件が対応付けられている２個目以降の話題を話す根拠となる定義を設定することである。２個目以降の話題に対応する内容を出力するためには、直前の話題に対して対話の相手から肯定的な回答（例えば、「はい」や「空いている」等）が得られることが条件となる。 Next, as activation condition setting 3, the topic module set creation unit 120 additionally sets a condition in the activation conditions associated with the second and subsequent topics of each scenario SC, as shown in Figures 14 to 16. The purpose of activation condition setting 3 is to define the grounds for raising the second and subsequent topics. To output content corresponding to a second or subsequent topic, the conversation partner must have given a positive response (for example, "yes" or "I'm hungry") to the immediately preceding topic.

例えば、シナリオＳＣ１を構成する２個目の話題である「ラーメン食べたい？」という話題に対応する内容を出力するためには、シナリオＳＣ１を構成する１個目の話題である「お腹空いている？」という話題に対して対話の相手から肯定的な回答（例えば、「はい」や「空いている」等）が得られることが条件となる。さらに、シナリオＳＣ１を構成する３個目の話題である「ラーメン店紹介ＯＫ？」という話題に対応する内容を出力するためには、シナリオＳＣ１を構成する２個目の話題である「ラーメン食べたい？」という話題に対して対話の相手から肯定的な回答（例えば、「はい」や「空いている」等）が得られることが条件となる。さらに、シナリオＳＣ１を構成する４個目の話題である「ラーメン店を推薦する」という話題に対応する内容を出力するためには、シナリオＳＣ１を構成する３個目の話題である「ラーメン店紹介ＯＫ？」という話題に対して対話の相手から肯定的な回答（例えば、「はい」や「空いている」等）が得られることが条件となる。そこで、話題モジュールセット作成部１２０は、２個目以降の話題に対応付けられている起動条件で定義されている内容のうち、直前の話題に関する定義内容を「ＹＥＳ」（直前の話題＝＝ＹＥＳ）と追加で設定する。 For example, to output content corresponding to "Want to eat ramen?", the second topic in scenario SC1, the conversation partner must have given a positive response (for example, "yes" or "I'm hungry") to "Are you hungry?", the first topic in scenario SC1. Likewise, to output content corresponding to "Can I recommend a ramen shop?", the third topic in scenario SC1, the conversation partner must have given a positive response to "Want to eat ramen?", the second topic. Finally, to output content corresponding to "Recommend a ramen shop", the fourth topic in scenario SC1, the conversation partner must have given a positive response to "Can I recommend a ramen shop?", the third topic. Therefore, in the activation conditions associated with the second and subsequent topics, the topic module set creation unit 120 additionally sets the definition concerning the immediately preceding topic to "YES" (previous topic == YES).

For example, in the activation condition associated with the topic "Want to eat ramen?" shown in FIG. 14, the topic module set creation unit 120 changes the entry for the immediately preceding topic "Are you hungry?" from "hungry == Unknown" to "hungry == Yes". As a result, the activation condition for outputting content corresponding to "Want to eat ramen?" is satisfied when a positive response to the immediately preceding topic "Are you hungry?" is obtained from the conversation partner.

Similarly, in the activation condition associated with the topic "Ramen shop introduction OK?" shown in FIG. 14, the topic module set creation unit 120 changes the entry for the immediately preceding topic "Want to eat ramen?" from "wants to eat ramen == Unknown" to "wants to eat ramen == Yes". As a result, the activation condition for outputting content corresponding to "Ramen shop introduction OK?" is satisfied when a positive response to the immediately preceding topic "Want to eat ramen?" is obtained from the conversation partner.

Similarly, in the activation condition associated with the topic "Recommend a ramen shop" shown in FIG. 14, the topic module set creation unit 120 changes the entry for the immediately preceding topic "Ramen shop introduction OK?" from "ramen shop introduction OK == Unknown" to "ramen shop introduction OK == Yes". As a result, the activation condition for outputting content corresponding to "Recommend a ramen shop" is satisfied when a positive response to the immediately preceding topic "Ramen shop introduction OK?" is obtained from the conversation partner.

The above description concerns scenario SC1, but the topic module set creation unit 120 performs the same processing on the other scenarios SC (for example, scenarios SC2 and SC3). This allows the content defined in the activation conditions to be additionally set as shown in FIGS. 14 to 16. In this way, the topic module set creation unit 120 sets an activation condition for each topic module by performing activation condition settings 1 to 3.
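Activation condition setting 3 can be pictured as a small transformation over per-topic conditions. The following is a minimal sketch, not the patented implementation: the state names, the dictionary representation of a condition, and the assumed result of settings 1 and 2 (each topic's own state and its predecessor's state both set to Unknown) are illustrative assumptions.

```python
from typing import Dict, List

# A condition maps a state name to the value it must have ("Unknown", "Yes", ...).
Condition = Dict[str, str]


def apply_setting_3(scenario: List[str], conditions: Dict[str, Condition]) -> Dict[str, Condition]:
    """Require a positive answer to the immediately preceding topic.

    `scenario` lists the state names of one scenario in dialogue order.
    The input mapping is left untouched; an updated copy is returned.
    """
    updated = {topic: dict(cond) for topic, cond in conditions.items()}
    for previous, current in zip(scenario, scenario[1:]):
        updated[current][previous] = "Yes"  # e.g. hungry == Unknown -> hungry == Yes
    return updated


# Hypothetical state names standing in for the topics of scenario SC1.
SC1 = ["hungry", "wants_ramen", "ramen_shop_intro_ok", "ramen_shop_recommended"]

# Assumed outcome of settings 1 and 2 (not taken from the source): each topic
# requires its own state and the preceding topic's state to be Unknown.
before: Dict[str, Condition] = {}
for index, topic in enumerate(SC1):
    before[topic] = {topic: "Unknown"}
    if index > 0:
        before[topic][SC1[index - 1]] = "Unknown"

after = apply_setting_3(SC1, before)
```

The first topic of a scenario has no predecessor, so its condition is left as it was; applying the same function to SC2 and SC3 would mirror the "same processing on the other scenarios" described above.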

Next, the topic module set creation unit 120 sets an output speech string for each topic module, as shown in FIGS. 17 to 19. This is the string that the dialogue device 10 outputs as speech when the activation condition of the topic module is satisfied. The topic module set creation unit 120 may automatically generate an output speech string suited to the context of the dialogue, or may set it using a string that a designer or the like created together with the scenario SC. When the output speech string is generated automatically, a language model or a generative model may be used. When a string created by a designer or the like together with the scenario SC is used, the designer creates a string for each topic constituting the scenario SC. The topic module set creation unit 120 then sets the string associated with each topic, as is, as the output speech string of the corresponding topic module.

When the processing described with reference to FIGS. 8 to 19 is complete, the topic module set creation unit 120 arranges the topic modules in a hierarchical structure according to a predetermined priority order, thereby generating a list in which activation conditions, topic modules, and output speech strings are associated with one another. From the generated list, the topic module set creation unit 120 creates the combinations of topic module content and output speech string as output language information, and stores the created output language information in the storage unit 11. In addition, from the generated list, the topic module set creation unit 120 creates the combinations of activation condition and topic module as a topic module set, and stores the created topic module set in the storage unit 11. In the following description, for simplicity, a topic module set created only from the content of scenarios SC1 and SC2 is used. Where necessary, a topic module set created using scenario SC3 may also be used.
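The split of the generated list into output language information and a topic module set can be sketched as follows. This is an illustrative sketch only: the `ListEntry` type, the field names, and the topic identifiers are hypothetical, and the list order is assumed to encode priority.

```python
from typing import Dict, List, NamedTuple, Tuple


class ListEntry(NamedTuple):
    condition: Dict[str, str]  # activation condition
    topic: str                 # topic module content
    speech: str                # output speech string


def split_list(entries: List[ListEntry]) -> Tuple[Dict[str, str], List[Tuple[Dict[str, str], str]]]:
    """Derive both stored artifacts from one priority-ordered list."""
    # Output language information: topic -> output speech string.
    output_language_info = {entry.topic: entry.speech for entry in entries}
    # Topic module set: (activation condition, topic), kept in priority order.
    topic_module_set = [(entry.condition, entry.topic) for entry in entries]
    return output_language_info, topic_module_set


entries = [
    ListEntry({"hungry": "Yes", "wants_ramen": "Unknown"}, "ask_wants_ramen",
              "If you're hungry, don't you want to eat ramen?"),
    ListEntry({"hungry": "Unknown"}, "ask_hungry", "Are you hungry?"),
]
output_language_info, topic_module_set = split_list(entries)
```

Keeping the two artifacts separate means the wording of a topic can be changed without touching the conditions that decide when the topic is selected.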

[Processing of the dialogue system 100 (part 1)]
FIG. 20 is a sequence diagram (part 1) showing the flow of processing of the dialogue system 100 in the embodiment. In the description of the processing of FIG. 20, the topic modules 113-n are assumed to be arranged in the hierarchical structure shown in FIG. 3. In FIG. 20, the character string on an arrow pointing from the user to the dialogue device 10 is the speech output by the dialogue device 10, and the character string on an arrow pointing from the dialogue device 10 to the user is the content of the user's utterance as analyzed by the dialogue device 10. Furthermore, the variable of each state indicated in the state information 112 at the start of the processing of FIG. 20 is assumed to have its initial value (for example, "U").

At the start of processing, triggered by the detection unit 122 detecting the user's behavior and the state update unit 125 updating the state information 112, the topic determination unit 126 of the dialogue device 10 determines a topic based on the state information 112 and the topic module set 113. The variable of each state indicated in the state information 112 at the start of the processing of FIG. 20 is "U", as shown in FIG. 21(A). The topic determination unit 126 refers to the topic module set 113 and selects a topic module 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112 and which has a high priority. For example, among the topic modules 113-n whose activation condition is satisfied by the combination of variables in the state information 112 shown in FIG. 21(A) (for example, every candidate state is "U"), the topic determination unit 126 selects the topic module 113-7, which has the highest priority. The topic determination unit 126 determines the topic of the selected topic module 113-7 (for example, "ask whether the user is hungry") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask whether the user is hungry") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech string corresponding to the determined topic. In the example shown in FIG. 4, "Aren't you hungry?" is selected as the output speech string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech string "Aren't you hungry?" and outputs the generated speech signal via the speaker 40. As a result, the speech "Aren't you hungry?" is output from the speaker 40 (step S101).

Suppose the user utters "I'm hungry" in response to the speech output from the speaker 40 (step S102). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The speech recognition unit 123 generates a character string corresponding to the user's speech through speech recognition processing. The character string generated by the speech recognition unit 123 is analyzed by the analysis unit 124 through natural language processing, which determines that the user uttered "I'm hungry". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information on the topic output from the topic determination unit 126 is "ask whether the user is hungry", and the utterance content analyzed by the analysis unit 124 is "I'm hungry". The state update unit 125 therefore selects "hungry" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 need only select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 21(B), the state update unit 125 updates the variable associated with the state "hungry" in the state information 112 from "U" to "Y". When the state information 112 has been updated by the state update unit 125, the topic determination unit 126 determines the next topic to speak based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 21(B), the state "hungry" in the state information 112 is "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects the topic module 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112 and which has the highest priority.
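The state update step can be illustrated with a short sketch. It assumes the pre-associated case mentioned above, in which each topic maps directly to one candidate state; the identifiers, the exact-match set of positive replies, and the choice to leave the state unchanged for any other reply are assumptions made for illustration, standing in for the natural language analysis of the analysis unit 124.

```python
from typing import Dict

# Hypothetical pre-association between topics and candidate states.
TOPIC_TO_STATE = {
    "ask_hungry": "hungry",
    "ask_wants_ramen": "wants_ramen",
    "ask_ramen_shop_intro_ok": "ramen_shop_intro_ok",
    "recommend_ramen_shop": "heard_recommendation",
}

# Assumed stand-in for natural language analysis: exact-match positive replies.
POSITIVE_REPLIES = {"yes", "i'm hungry", "i want to eat", "sure", "i understand"}


def update_state(state: Dict[str, str], topic: str, utterance: str) -> Dict[str, str]:
    """Return a copy of `state` with the topic's candidate state set to "Y"
    when the reply is positive; other replies leave the state unchanged here."""
    updated = dict(state)
    if utterance.strip().lower() in POSITIVE_REPLIES:
        updated[TOPIC_TO_STATE[topic]] = "Y"
    return updated


initial = {name: "U" for name in TOPIC_TO_STATE.values()}
after_first_reply = update_state(initial, "ask_hungry", "I'm hungry")
```

After this call only "hungry" has changed from "U" to "Y", which corresponds to the transition from FIG. 21(A) to FIG. 21(B).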

For example, among the topic modules 113-n whose activation condition is satisfied by the combination of variables in the state information 112 shown in FIG. 21(B) (the candidate state "hungry" is "Y" and all other states are "U"), the topic determination unit 126 selects the topic module 113-5, which has the highest priority. The topic determination unit 126 determines the topic of the selected topic module 113-5 (for example, "ask whether the user wants to eat ramen") as the topic to be output, and outputs information on the determined topic to the state update unit 125.
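The selection rule described here, choosing the highest-priority topic module whose activation condition is satisfied by the current state, can be sketched as a first-match scan over a priority-ordered list. The concrete conditions and identifiers below are assumptions chosen to reproduce the SC1 walk-through; they are not the contents of FIG. 3.

```python
from typing import Dict, List, Optional, Tuple

Module = Tuple[str, Dict[str, str]]  # (topic, activation condition)

# Assumed priority order, highest first, with hypothetical conditions.
MODULES: List[Module] = [
    ("recommend_ramen_shop", {"ramen_shop_intro_ok": "Y", "heard_recommendation": "U"}),
    ("ask_ramen_shop_intro_ok", {"wants_ramen": "Y", "ramen_shop_intro_ok": "U"}),
    ("ask_wants_ramen", {"hungry": "Y", "wants_ramen": "U"}),
    ("ask_hungry", {"hungry": "U"}),
]


def select_topic(state: Dict[str, str], modules: List[Module]) -> Optional[str]:
    """Return the first (highest-priority) topic whose condition holds, or None."""
    for topic, condition in modules:
        if all(state.get(name) == value for name, value in condition.items()):
            return topic
    return None


state_a = {"hungry": "U", "wants_ramen": "U",
           "ramen_shop_intro_ok": "U", "heard_recommendation": "U"}
state_b = dict(state_a, hungry="Y")

first_topic = select_topic(state_a, MODULES)
second_topic = select_topic(state_b, MODULES)
```

With every state at "U" only the "ask_hungry" condition holds, and once "hungry" is "Y" the scan stops at "ask_wants_ramen", matching the order of steps S101 and S103.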

The language generation unit 127 generates the character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask whether the user wants to eat ramen") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech string corresponding to the determined topic. In the example shown in FIG. 4, "If you're hungry, don't you want to eat ramen?" is selected as the output speech string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you're hungry, don't you want to eat ramen?" is output from the speaker 40 (step S103).

Suppose the user utters "I want to eat" in response to the speech output from the speaker 40 (step S104). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The speech recognition unit 123 generates a character string corresponding to the user's speech through speech recognition processing. The character string generated by the speech recognition unit 123 is analyzed by the analysis unit 124 through natural language processing, which determines that the user uttered "I want to eat". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information on the topic output from the topic determination unit 126 is "ask whether the user wants to eat ramen", and the utterance content analyzed by the analysis unit 124 is "I want to eat". The state update unit 125 therefore selects "wants to eat ramen" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 need only select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 21(C), the state update unit 125 updates the variable associated with the state "wants to eat ramen" in the state information 112 from "U" to "Y". When the state information 112 has been updated by the state update unit 125, the topic determination unit 126 determines the next topic to speak based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 21(C), the states "hungry" and "wants to eat ramen" in the state information 112 are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects the topic module 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112 and which has the highest priority.

For example, among the topic modules 113-n whose activation condition is satisfied by the combination of variables in the state information 112 shown in FIG. 21(C) (the candidate states "hungry" and "wants to eat ramen" are "Y" and all other states are "U"), the topic determination unit 126 selects the topic module 113-3, which has the highest priority. The topic determination unit 126 determines the topic of the selected topic module 113-3 (for example, "ask whether it is OK to introduce a ramen shop") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask whether it is OK to introduce a ramen shop") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech string corresponding to the determined topic. In the example shown in FIG. 4, "If you want to eat ramen, may I introduce a ramen shop?" is selected as the output speech string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you want to eat ramen, may I introduce a ramen shop?" is output from the speaker 40 (step S105).

Suppose the user utters "Sure" in response to the speech output from the speaker 40 (step S106). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The speech recognition unit 123 generates a character string corresponding to the user's speech through speech recognition processing. The character string generated by the speech recognition unit 123 is analyzed by the analysis unit 124 through natural language processing, which determines that the user uttered "Sure". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information on the topic output from the topic determination unit 126 is "ask whether it is OK to introduce a ramen shop", and the utterance content analyzed by the analysis unit 124 is "Sure". The state update unit 125 therefore selects "ramen shop introduction OK" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 need only select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 21(D), the state update unit 125 updates the variable associated with the state "ramen shop introduction OK" in the state information 112 from "U" to "Y". When the state information 112 has been updated by the state update unit 125, the topic determination unit 126 determines the next topic to speak based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 21(D), the states "hungry", "wants to eat ramen", and "ramen shop introduction OK" in the state information 112 are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects the topic module 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112 and which has the highest priority.

For example, among the topic modules 113-n whose activation condition is satisfied by the combination of variables in the state information 112 shown in FIG. 21(D) (the candidate states "hungry", "wants to eat ramen", and "ramen shop introduction OK" are "Y" and all other states are "U"), the topic determination unit 126 selects the topic module 113-1, which has the highest priority. The topic determination unit 126 determines the topic of the selected topic module 113-1 (for example, "recommend a ramen shop") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "recommend a ramen shop") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech string corresponding to the determined topic. In the example shown in FIG. 4, "If it's OK to introduce a ramen shop, I recommend a ramen shop called XX" is selected as the output speech string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech string and outputs the generated speech signal via the speaker 40. As a result, the speech "If it's OK to introduce a ramen shop, I recommend a ramen shop called XX" is output from the speaker 40 (step S107).

Suppose the user utters "I understand" in response to the speech output from the speaker 40 (step S108). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The speech recognition unit 123 generates a character string corresponding to the user's speech through speech recognition processing. The character string generated by the speech recognition unit 123 is analyzed by the analysis unit 124 through natural language processing, which determines that the user uttered "I understand". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information on the topic output from the topic determination unit 126 is "recommend a ramen shop", and the utterance content analyzed by the analysis unit 124 is "I understand". The state update unit 125 therefore selects "heard ramen shop recommendation" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 need only select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 21(E), the state update unit 125 updates the variable associated with the state "heard ramen shop recommendation" in the state information 112 from "U" to "Y". When the state information 112 has been updated by the state update unit 125, the topic determination unit 126 determines the next topic to speak based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 21(E), the states "hungry", "wants to eat ramen", "ramen shop introduction OK", and "heard ramen shop recommendation" in the state information 112 are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects the topic module 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112 and which has the highest priority.

At this point, however, there is no selectable topic. In this case, the dialogue device 10 ends the dialogue with the user. When the dialogue ends, the state update unit 125 initializes all of the variables of the states indicated in the state information 112. This allows the dialogue device 10 to handle a dialogue with a new user.
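The overall control flow of steps S101 to S108, including termination when no topic is selectable and re-initialization of the state, can be sketched as a loop. This is a simplified sketch under stated assumptions: the module conditions and identifiers are hypothetical, user replies are supplied as scripted booleans rather than recognized speech, and recording a negative reply as "N" is an assumption that this passage does not describe.

```python
from typing import Dict, Iterator, List, Tuple

# (topic, candidate state it updates, activation condition), highest priority first.
# All names and conditions are illustrative assumptions.
MODULES: List[Tuple[str, str, Dict[str, str]]] = [
    ("recommend_ramen_shop", "heard_recommendation",
     {"ramen_shop_intro_ok": "Y", "heard_recommendation": "U"}),
    ("ask_ramen_shop_intro_ok", "ramen_shop_intro_ok",
     {"wants_ramen": "Y", "ramen_shop_intro_ok": "U"}),
    ("ask_wants_ramen", "wants_ramen", {"hungry": "Y", "wants_ramen": "U"}),
    ("ask_hungry", "hungry", {"hungry": "U"}),
]


def run_dialogue(replies: Iterator[bool]) -> Tuple[List[str], Dict[str, str]]:
    """Run one dialogue; return the topics spoken and the state after reset."""
    state = {name: "U" for _, name, _ in MODULES}
    spoken: List[str] = []
    while True:
        chosen = next(
            ((topic, name) for topic, name, cond in MODULES
             if all(state[key] == value for key, value in cond.items())),
            None,
        )
        if chosen is None:
            break  # no selectable topic: the dialogue ends
        topic, state_name = chosen
        spoken.append(topic)
        # Assumption: a negative reply is stored as "N" so the topic is not re-asked.
        state[state_name] = "Y" if next(replies) else "N"
    # Initialize every variable so the next user starts from a clean state.
    state = {name: "U" for name in state}
    return spoken, state


topics_spoken, final_state = run_dialogue(iter([True, True, True, True]))
```

With four positive replies the loop speaks the four SC1 topics in order and then stops, because every module's own state is no longer "U".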

FIG. 22 is a diagram showing an example (part 2) of the topic module set 113. In the example shown in FIG. 22, all of the topic modules 113-n are arranged in descending order of distance from the goal, where distance is the number of topics passed through before reaching the goal. Arranging them in this order makes it possible to obtain as much information as possible about the user's state.
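One way to picture this ordering is to compute, for each topic, how many topics remain before its scenario's goal and sort farthest first. The sketch below assumes that the last topic of each scenario is its goal and uses placeholder scenario contents; how FIG. 22 breaks ties between scenarios is not stated here, so the stable ordering used below is also an assumption.

```python
from typing import Dict, List, Tuple


def order_by_distance_from_goal(scenarios: Dict[str, List[str]]) -> List[str]:
    """Order topics so that those farthest from their scenario's goal come first."""
    distances: List[Tuple[int, str]] = []
    for topics in scenarios.values():
        goal_index = len(topics) - 1  # assumption: the last topic is the goal
        for index, topic in enumerate(topics):
            distances.append((goal_index - index, topic))
    # sorted() is stable, so ties keep the order in which scenarios were given.
    ranked = sorted(distances, key=lambda pair: pair[0], reverse=True)
    return [topic for _, topic in ranked]


# Placeholder scenarios; the names do not come from the source.
scenarios = {
    "SC_A": ["a1", "a2", "a3", "a4"],
    "SC_B": ["b1", "b2", "b3"],
}
ordered = order_by_distance_from_goal(scenarios)
```

Because opening questions of every scenario rank ahead of goal topics, a selector that scans this list in order tends to ask more questions before committing to any one goal.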

[Processing of the dialogue system 100 (part 2)]
FIG. 23 is a sequence diagram (part 2) showing the flow of processing of the dialogue system 100 in the embodiment. In the description of the processing in FIG. 23, it is assumed that the topic modules 113-n are arranged in the hierarchical structure shown in FIG. 22. In FIG. 23, the character string on an arrow pointing from the user to the dialogue device 10 is the voice output by the dialogue device 10, and the character string on an arrow pointing from the dialogue device 10 to the user is the content of the user's utterance as analyzed by the dialogue device 10. Furthermore, it is assumed that the variables of the states indicated by the state information 112 hold their initial values at the start of the processing in FIG. 23.

The topic determination unit 126 of the dialogue device 10 determines a topic based on the state information 112 and the topic module set 113, triggered when the detection unit 122 detects the user's behavior and the state update unit 125 updates the state information 112. At the start of the processing in FIG. 23, the variable of every state indicated by the state information 112 is "U", as shown in FIG. 24(A). The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority. For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(A) (here, every candidate state is "U") and selects topic module 113-9, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-9 (for example, "ask if tired") as the topic to be output, and outputs information about the determined topic to the state update unit 125.
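The selection rule used by the topic determination unit 126 can be sketched as a scan over a priority-ordered list. This is a sketch under stated assumptions: the tuple layout, the function name `select_module`, and the activation conditions shown are hypothetical, since the conditions of FIG. 22 are not reproduced in this passage.

```python
from typing import Dict, List, Optional, Tuple

State = Dict[str, str]
# (module id, topic, activation condition); list order encodes priority.
Module = Tuple[str, str, Dict[str, str]]


def select_module(state: State, modules: List[Module]) -> Optional[Module]:
    """Return the highest-priority module whose activation condition the state satisfies."""
    for module in modules:
        _, _, condition = module
        if all(state.get(name) == value for name, value in condition.items()):
            return module
    return None  # no selectable topic


# Hypothetical conditions: a question is asked only while its answer is unknown.
MODULES: List[Module] = [
    ("113-9", "ask if tired", {"tired": "U"}),
    ("113-8", "ask if hungry", {"hungry": "U"}),
]

state = {"tired": "U", "hungry": "U"}
first = select_module(state, MODULES)   # every state is "U"
state["tired"] = "Y"
second = select_module(state, MODULES)  # "tired" is now known
```

Under these assumed conditions the first call yields module 113-9 and the second yields module 113-8, matching the order in which the two topics appear in the sequence of FIG. 23.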

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if tired") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "Aren't you tired?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string "Aren't you tired?" and outputs the generated speech signal via the speaker 40. As a result, the speech "Aren't you tired?" is output from the speaker 40 (step S201).
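The lookup performed by the language generation unit 127 amounts to a table from topics to output strings. The sketch below assumes a plain dictionary for the output language information 114; the name `generate_utterance` and the English topic keys are illustrative, and only strings quoted in this description are used as values.

```python
from typing import Dict

# Assumed dictionary form of the output language information 114 (FIG. 4).
OUTPUT_LANGUAGE_INFO: Dict[str, str] = {
    "ask if tired": "Aren't you tired?",
    "ask if wants to eat ramen": "If you're hungry, don't you want to eat ramen?",
    "ask if wants to eat pasta": "If you're hungry, don't you want to eat pasta?",
}


def generate_utterance(topic: str, table: Dict[str, str] = OUTPUT_LANGUAGE_INFO) -> str:
    """Return the output speech character string registered for the topic."""
    try:
        return table[topic]
    except KeyError:
        raise ValueError(f"no output string registered for topic {topic!r}") from None


utterance = generate_utterance("ask if tired")
```

The returned string would then be handed to a speech synthesizer; that step is outside the scope of this sketch.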

Suppose that the user utters "I'm tired" in response to the voice output from the speaker 40 (step S202). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "I'm tired". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated by the state information 112, based on the information about the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information about the topic output from the topic determination unit 126 is "ask if tired", and the utterance content analyzed by the analysis unit 124 is "I'm tired". The state update unit 125 therefore selects "tired" as the applicable candidate state. Note that, if each topic in the topic modules 113-n is associated in advance with a candidate state indicated by the state information 112, the state update unit 125 may simply select the candidate state associated with the information about the topic output from the topic determination unit 126.
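The pre-associated variant mentioned at the end of the paragraph above can be sketched as follows. The mapping `TOPIC_TO_STATE`, the set `AFFIRMATIVE`, the function `update_state`, and the use of "N" for a non-affirmative reply are all assumptions for illustration; this passage only shows updates from "U" to "Y".

```python
from typing import Dict, Optional

# Assumed pre-association between topics and candidate states.
TOPIC_TO_STATE: Dict[str, str] = {
    "ask if tired": "tired",
    "ask if wants to eat pasta": "wants to eat pasta",
}

# Assumed set of analyzed utterances treated as affirmative.
AFFIRMATIVE = {"I'm tired", "I want to eat", "Sure"}


def update_state(state: Dict[str, str], topic: str, utterance: str) -> Optional[str]:
    """Update the candidate state tied to the topic; return its name, or None if there is none."""
    candidate = TOPIC_TO_STATE.get(topic)
    if candidate is None or candidate not in state:
        return None
    # "N" for a non-affirmative reply is an assumption, not taken from this passage.
    state[candidate] = "Y" if utterance in AFFIRMATIVE else "N"
    return candidate


state = {"tired": "U", "wants to eat pasta": "U"}
updated = update_state(state, "ask if tired", "I'm tired")
```

Keying the update on the topic rather than on the utterance alone is what lets a short reply such as "I want to eat" be resolved to the right state, because the topic supplies the missing object.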

As shown in FIG. 24(B), the state update unit 125 updates the variable associated with the state "tired" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 24(B), the state "tired" is "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(B) (here, the candidate state "tired" is "Y" and all other states are "U") and selects topic module 113-8, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-8 (for example, "ask if hungry") as the topic to be output, and outputs information about the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if hungry") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "Aren't you hungry?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string "Aren't you hungry?" and outputs the generated speech signal via the speaker 40. As a result, the speech "Aren't you hungry?" is output from the speaker 40 (step S203).

Suppose that the user utters "I'm hungry" in response to the voice output from the speaker 40 (step S204). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "I'm hungry". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 24(C), the state update unit 125 updates the variable associated with the state "hungry" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 24(C), the states "tired" and "hungry" are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(C) (here, the candidate states "tired" and "hungry" are "Y" and all other states are "U") and selects topic module 113-5, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-5 (for example, "ask if wants to eat ramen") as the topic to be output, and outputs information about the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if wants to eat ramen") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "If you're hungry, don't you want to eat ramen?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you're hungry, don't you want to eat ramen?" is output from the speaker 40 (step S205).

Suppose that the user utters "I want to eat" in response to the voice output from the speaker 40 (step S206). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "I want to eat". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 24(D), the state update unit 125 updates the variable associated with the state "wants to eat ramen" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 24(D), the states "tired", "hungry", and "wants to eat ramen" are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(D) (here, the candidate states "tired", "hungry", and "wants to eat ramen" are "Y" and all other states are "U") and selects topic module 113-6, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-6 (for example, "ask if wants to eat pasta") as the topic to be output, and outputs information about the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if wants to eat pasta") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "If you're hungry, don't you want to eat pasta?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you're hungry, don't you want to eat pasta?" is output from the speaker 40 (step S207).

Suppose that the user utters "I want to eat" in response to the voice output from the speaker 40 (step S208). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "I want to eat". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated by the state information 112, based on the information about the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information about the topic output from the topic determination unit 126 is "ask if wants to eat pasta", and the utterance content analyzed by the analysis unit 124 is "I want to eat". The state update unit 125 therefore selects "wants to eat pasta" as the applicable candidate state. Note that, if each topic in the topic modules 113-n is associated in advance with a candidate state indicated by the state information 112, the state update unit 125 may simply select the candidate state associated with the information about the topic output from the topic determination unit 126.

As shown in FIG. 24(E), the state update unit 125 updates the variable associated with the state "wants to eat pasta" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 24(E), the states "tired", "hungry", "wants to eat ramen", and "wants to eat pasta" are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(E) (here, the candidate states "tired", "hungry", "wants to eat ramen", and "wants to eat pasta" are "Y" and all other states are "U") and selects topic module 113-3, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-3 (for example, "ask if ramen shop introduction is OK") as the topic to be output, and outputs information about the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if ramen shop introduction is OK") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "If you want to eat ramen, can I introduce you to a ramen shop?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you want to eat ramen, can I introduce you to a ramen shop?" is output from the speaker 40 (step S209).

Suppose that the user utters "Sure" in response to the voice output from the speaker 40 (step S210). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "Sure". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated by the state information 112, based on the information about the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. Here, the information about the topic output from the topic determination unit 126 is "ask if ramen shop introduction is OK", and the utterance content analyzed by the analysis unit 124 is "Sure". The state update unit 125 therefore selects "ramen shop introduction OK" as the applicable candidate state. Note that, if each topic in the topic modules 113-n is associated in advance with a candidate state indicated by the state information 112, the state update unit 125 may simply select the candidate state associated with the information about the topic output from the topic determination unit 126.

As shown in FIG. 24(F), the state update unit 125 updates the variable associated with the state "ramen shop introduction OK" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 24(F), the states "tired", "hungry", "wants to eat ramen", "wants to eat pasta", and "ramen shop introduction OK" are "Y" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 24(F) (here, the candidate states "tired", "hungry", "wants to eat ramen", "wants to eat pasta", and "ramen shop introduction OK" are "Y" and all other states are "U") and selects topic module 113-4, which has the highest priority among the topic modules 113-n whose activation conditions are satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-4 (for example, "ask if pasta restaurant introduction is OK") as the topic to be output, and outputs information about the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if pasta restaurant introduction is OK") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the language generation unit 127 selects "If you want to eat pasta, can I introduce you to a pasta restaurant?" from the output language information 114 as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you want to eat pasta, can I introduce you to a pasta restaurant?" is output from the speaker 40 (step S211).

Suppose that the user utters "Sure" in response to the voice output from the speaker 40 (step S212). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing. The analysis unit 124 analyzes this character string by natural language processing and thereby determines that the user uttered "Sure". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. In this example, the topic output from the topic determination unit 126 is "Ask if it's OK to introduce a pasta restaurant" and the analyzed utterance content is "Sure". The state update unit 125 therefore selects "Pasta restaurant introduction OK" as the applicable candidate state. Note that if the topics in the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 may simply select the candidate state associated with the topic output from the topic determination unit 126.
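The state update described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the state update unit 125: the mapping `TOPIC_TO_STATE`, the word lists `AFFIRMATIVE` and `NEGATIVE`, and the function `update_state` are hypothetical, and the variable value "U" is treated here as "not yet determined".

```python
# Hypothetical pre-association between topics and candidate states.
TOPIC_TO_STATE = {
    "Ask if it's OK to introduce a pasta restaurant": "Pasta restaurant introduction OK",
    "Ask if they want to eat ramen": "Want to eat ramen",
}

# Hypothetical stand-in for the result of natural language analysis:
# utterances counted as affirmative or negative answers.
AFFIRMATIVE = {"sure", "i understand", "yes"}
NEGATIVE = {"i don't want to eat", "no"}


def update_state(state: dict, topic: str, utterance: str) -> dict:
    """Return a copy of `state` in which the candidate state tied to `topic`
    is set to "Y" or "N" according to the user's utterance."""
    candidate = TOPIC_TO_STATE[topic]
    text = utterance.strip().lower()
    new_state = dict(state)
    if text in AFFIRMATIVE:
        new_state[candidate] = "Y"
    elif text in NEGATIVE:
        new_state[candidate] = "N"
    # Assumption: an utterance that is not understood leaves the variable unchanged.
    return new_state


state = {"Pasta restaurant introduction OK": "U", "Want to eat ramen": "U"}
state = update_state(state, "Ask if it's OK to introduce a pasta restaurant", "Sure")
print(state["Pasta restaurant introduction OK"])  # Y
```

Returning a copy rather than mutating the input keeps each step of the dialogue easy to inspect; an actual device could equally update the state information 112 in place.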

As shown in FIG. 25(A), the state update unit 125 updates the variable associated with the state "Pasta restaurant introduction OK" in the state information 112 from "U" to "Y". When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 25(A), the variables of the states "Tired", "Hungry", "Want to eat ramen", "Want to eat pasta", "Ramen restaurant introduction OK", and "Pasta restaurant introduction OK" are "Y", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 25(A) ("Tired", "Hungry", "Want to eat ramen", "Want to eat pasta", "Ramen restaurant introduction OK", and "Pasta restaurant introduction OK" are "Y", and the other states are "U"). Among the topic modules 113-n whose activation conditions are satisfied by this combination, it selects the one with the highest priority, the topic module 113-1. The topic determination unit 126 determines the topic of the selected topic module 113-1 (e.g., "Recommend a ramen restaurant") as the topic to be output, and outputs information on the determined topic to the state update unit 125.
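The selection rule, namely picking the highest-priority topic module whose activation condition is satisfied by the current state variables, can be sketched as below. The activation conditions shown are assumptions made for illustration, since the conditions actually defined in the figures are not reproduced in this passage; `TopicModule` and `determine_topic` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TopicModule:
    module_id: str
    topic: str
    condition: dict  # state name -> required variable value ("Y", "N" or "U")


def determine_topic(modules, state):
    """Return the first module (modules are listed from highest to lowest
    priority) whose activation condition is satisfied, or None."""
    for module in modules:
        if all(state.get(name, "U") == value
               for name, value in module.condition.items()):
            return module
    return None


# Assumed conditions: a recommendation is made once the user has agreed to an
# introduction and has not yet heard that recommendation.
MODULES = [
    TopicModule("113-1", "Recommend a ramen restaurant",
                {"Ramen restaurant introduction OK": "Y",
                 "Heard ramen restaurant recommendation": "U"}),
    TopicModule("113-2", "Recommend a pasta restaurant",
                {"Pasta restaurant introduction OK": "Y",
                 "Heard pasta restaurant recommendation": "U"}),
]

# State corresponding to the description of FIG. 25(A).
state = {name: "Y" for name in (
    "Tired", "Hungry", "Want to eat ramen", "Want to eat pasta",
    "Ramen restaurant introduction OK", "Pasta restaurant introduction OK")}

first = determine_topic(MODULES, state)
state["Heard ramen restaurant recommendation"] = "Y"
second = determine_topic(MODULES, state)
print(first.module_id, second.module_id)  # 113-1 113-2
```

Both modules are active in the first state, and the priority order alone decides that the ramen recommendation is spoken first; once it has been heard, the pasta recommendation becomes the highest-priority active module.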

The language generation unit 127 generates the character string to be output as speech, based on the topic determined by the topic determination unit 126 (e.g., "Recommend a ramen restaurant") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the string "If you're OK with a ramen restaurant introduction, I recommend a ramen restaurant called XX" is selected. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs it via the speaker 40. As a result, the voice "If you're OK with a ramen restaurant introduction, I recommend a ramen restaurant called XX" is output from the speaker 40 (step S213).

Assume that the user says "I understand" in response to the voice output from the speaker 40 (step S214). The user's voice is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I understand". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 25(B), the state update unit 125 updates the variable associated with the state "Ramen restaurant recommendation" in the state information 112 from "U" to "Y". When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 25(B), the variables of the states "Tired", "Hungry", "Want to eat ramen", "Want to eat pasta", "Ramen restaurant introduction OK", "Pasta restaurant introduction OK", and "Ramen restaurant recommendation" are "Y", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 25(B) ("Tired", "Hungry", "Want to eat ramen", "Want to eat pasta", "Ramen restaurant introduction OK", "Pasta restaurant introduction OK", and "Ramen restaurant recommendation" are "Y", and the other states are "U"). Among the topic modules 113-n whose activation conditions are satisfied by this combination, it selects the one with the highest priority, the topic module 113-2. The topic determination unit 126 determines the topic of the selected topic module 113-2 (e.g., "Recommend a pasta restaurant") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech, based on the topic determined by the topic determination unit 126 (e.g., "Recommend a pasta restaurant") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the string "If you're OK with a pasta restaurant introduction, I recommend a pasta restaurant called XX" is selected. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs it via the speaker 40. As a result, the voice "If you're OK with a pasta restaurant introduction, I recommend a pasta restaurant called XX" is output from the speaker 40 (step S215).

Assume that the user says "I understand" in response to the voice output from the speaker 40 (step S216). The user's voice is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I understand". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. In this example, the topic output from the topic determination unit 126 is "Recommend a pasta restaurant" and the analyzed utterance content is "I understand". The state update unit 125 therefore selects "Heard pasta restaurant recommendation" as the applicable candidate state. Note that if the topics in the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 may simply select the candidate state associated with the topic output from the topic determination unit 126.

As shown in FIG. 25(C), the state update unit 125 updates the variable associated with the state "Heard pasta restaurant recommendation" in the state information 112 from "U" to "Y". When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 25(C), the variables of the states "Hungry", "Want to eat ramen", "Ramen restaurant introduction OK", "Heard ramen restaurant recommendation", and "Heard pasta restaurant recommendation" are "Y", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

FIG. 26 is a diagram showing an example (part 3) of the topic module set 113. In the example shown in FIG. 26, the topic modules 113-n are arranged so that topics related to the goal to be prioritized come first. Specifically, the arrangement gives the goal "Recommend a ramen restaurant" priority over the goal "Recommend a pasta restaurant". Arranging the modules in this order gives top priority to the high-priority goal among multiple goals, no matter how far the dialogue currently is from reaching that goal.
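One way to obtain such a goal-first arrangement is to tag each topic module with the goal it serves and sort by goal rank, as in the following Python sketch. The goal tags, the rank table `GOAL_RANK`, the function `arrange_by_goal`, and the topic shown for module 113-3 are assumptions for illustration; the actual arrangement is the one drawn in FIG. 26.

```python
# Hypothetical goal ranks: lower value means higher priority.
GOAL_RANK = {"ramen": 0, "pasta": 1, "common": 2}

# (module id, topic, goal served). The goal tags are assumed, and the topic of
# 113-3 is not named in this passage.
MODULES = [
    ("113-1", "Recommend a ramen restaurant", "ramen"),
    ("113-2", "Recommend a pasta restaurant", "pasta"),
    ("113-3", "Ask if it's OK to introduce a ramen restaurant", "ramen"),
    ("113-4", "Ask if it's OK to introduce a pasta restaurant", "pasta"),
    ("113-5", "Ask if they want to eat ramen", "ramen"),
    ("113-6", "Ask if they want to eat pasta", "pasta"),
    ("113-7", "Ask if hungry", "common"),
]


def arrange_by_goal(modules):
    """Place every module of a higher-priority goal before any module of a
    lower-priority goal. sorted() is stable, so the relative order within
    one goal is preserved."""
    return sorted(modules, key=lambda m: GOAL_RANK[m[2]])


ordered_ids = [m[0] for m in arrange_by_goal(MODULES)]
print(ordered_ids)
# ['113-1', '113-3', '113-5', '113-2', '113-4', '113-6', '113-7']
```

With this ordering, even the earliest question on the ramen path outranks the final recommendation on the pasta path, which is the "regardless of distance to the goal" behavior described above.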

[Processing of the dialogue system 100 (part 3)]
FIG. 27 is a sequence diagram (part 3) showing the flow of processing of the dialogue system 100 in the embodiment. In the description of the processing in FIG. 27, it is assumed that the topic modules 113-n are arranged in the hierarchical structure shown in FIG. 26. In FIG. 27, the character strings on the arrows from the dialogue device 10 to the user are the voice output by the dialogue device 10, and the character strings on the arrows from the user to the dialogue device 10 are the user's utterance content as analyzed by the dialogue device 10. It is further assumed that the variables of the states indicated in the state information 112 are at their initial values when the processing in FIG. 27 starts.

The topic determination unit 126 of the dialogue device 10 determines a topic based on the state information 112 and the topic module set 113, triggered by the detection unit 122 detecting the user's behavior and the state update unit 125 updating the state information 112. At the start of the processing in FIG. 27, the variables of all states indicated in the state information 112 are "U", as shown in FIG. 28(A). The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority. For example, the topic determination unit 126 refers to the combination of variables in the state information 112 shown in FIG. 28(A) (every candidate state is "U") and, among the topic modules 113-n whose activation conditions are satisfied by this combination, selects the one with the highest priority, the topic module 113-7. The topic determination unit 126 determines the topic of the selected topic module 113-7 (e.g., "Ask if hungry") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech, based on the topic determined by the topic determination unit 126 (e.g., "Ask if hungry") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the string "Are you hungry?" is selected. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs it via the speaker 40. As a result, the voice "Are you hungry?" is output from the speaker 40 (step S301).

Assume that the user says "I'm hungry" in response to the voice output from the speaker 40 (step S302). The user's voice is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I'm hungry". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 28(B), the state update unit 125 updates the variable associated with the state "Hungry" in the state information 112 from "U" to "Y". When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 28(B), the variable of the state "Hungry" is "Y", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 28(B) ("Hungry" is "Y", and the other states are "U"). Among the topic modules 113-n whose activation conditions are satisfied by this combination, it selects the one with the highest priority, the topic module 113-5. The topic determination unit 126 determines the topic of the selected topic module 113-5 (e.g., "Ask if they want to eat ramen") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech, based on the topic determined by the topic determination unit 126 (e.g., "Ask if they want to eat ramen") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the string "If you're hungry, don't you want to eat ramen?" is selected. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs it via the speaker 40. As a result, the voice "If you're hungry, don't you want to eat ramen?" is output from the speaker 40 (step S303).

Assume that the user says "I don't want to eat" in response to the voice output from the speaker 40 (step S304). The user's voice is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's utterance through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I don't want to eat". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. In this example, the topic output from the topic determination unit 126 is "Ask if they want to eat ramen" and the analyzed utterance content is "I don't want to eat". The state update unit 125 therefore selects "Want to eat ramen" as the applicable candidate state. Note that if the topics in the topic modules 113-n are associated in advance with the candidate states indicated in the state information 112, the state update unit 125 may simply select the candidate state associated with the topic output from the topic determination unit 126.

As shown in FIG. 28(C), the state update unit 125 updates the variable associated with the state "Want to eat ramen" in the state information 112 from "U" to "N". When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 28(C), the variable of the state "Hungry" is "Y", that of the state "Want to eat ramen" is "N", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 28(C) ("Hungry" is "Y", "Want to eat ramen" is "N", and the other states are "U"). Among the topic modules 113-n whose activation conditions are satisfied by this combination, it selects the one with the highest priority, the topic module 113-6. The topic determination unit 126 determines the topic of the selected topic module 113-6 (e.g., "Ask if they want to eat pasta") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates the character string to be output as speech, based on the topic determined by the topic determination unit 126 (e.g., "Ask if they want to eat pasta") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, the string "If you're hungry, don't you want to eat pasta?" is selected. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs it via the speaker 40. As a result, the voice "If you're hungry, don't you want to eat pasta?" is output from the speaker 40 (step S305).

Following the user's reply to this question, the state update unit 125 updates the variable associated with the state "Want to eat pasta" in the state information 112 from "U" to "Y", as shown in FIG. 28(D). When the state information 112 has been updated, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 28(D), the variables of the states "Hungry" and "Want to eat pasta" are "Y", that of the state "Want to eat ramen" is "N", and those of the other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, selects the one with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 28(D) ("Hungry" and "Want to eat pasta" are "Y", "Want to eat ramen" is "N", and the other states are "U"). Among the topic modules 113-n whose activation conditions are satisfied by this combination, it selects the one with the highest priority, the topic module 113-4. The topic determination unit 126 determines the topic of the selected topic module 113-4 (e.g., "Ask if it's OK to introduce a pasta restaurant") as the topic to be output, and outputs information on the determined topic to the state update unit 125.
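The sequence of selections in FIG. 27 up to this point (113-7, 113-5, 113-6, 113-4) can be reproduced with a short Python trace. The priority order and every activation condition below are assumptions chosen to be consistent with the walk-through, not the contents of FIG. 26; `MODULES`, `select`, and `UPDATES` are hypothetical names, as is the topic given for module 113-3.

```python
# Modules from highest to lowest priority (assumed goal-first ordering).
# Each entry: (module id, topic, assumed activation condition).
MODULES = [
    ("113-1", "Recommend a ramen restaurant",
     {"Ramen restaurant introduction OK": "Y",
      "Heard ramen restaurant recommendation": "U"}),
    ("113-3", "Ask if it's OK to introduce a ramen restaurant",
     {"Want to eat ramen": "Y", "Ramen restaurant introduction OK": "U"}),
    ("113-5", "Ask if they want to eat ramen",
     {"Hungry": "Y", "Want to eat ramen": "U"}),
    ("113-2", "Recommend a pasta restaurant",
     {"Pasta restaurant introduction OK": "Y",
      "Heard pasta restaurant recommendation": "U"}),
    ("113-4", "Ask if it's OK to introduce a pasta restaurant",
     {"Want to eat pasta": "Y", "Pasta restaurant introduction OK": "U"}),
    ("113-6", "Ask if they want to eat pasta",
     {"Hungry": "Y", "Want to eat pasta": "U"}),
    ("113-7", "Ask if hungry", {"Hungry": "U"}),
]


def select(state):
    """Return (id, topic) of the highest-priority module whose condition holds.
    A state missing from `state` is treated as "U"."""
    for module_id, topic, condition in MODULES:
        if all(state.get(k, "U") == v for k, v in condition.items()):
            return module_id, topic
    return None


# State updates in the order described for FIG. 28(B), (C) and (D).
UPDATES = [("Hungry", "Y"), ("Want to eat ramen", "N"), ("Want to eat pasta", "Y")]

state = {}                      # FIG. 28(A): every state is "U"
trace = [select(state)[0]]
for name, value in UPDATES:
    state[name] = value
    trace.append(select(state)[0])

print(trace)  # ['113-7', '113-5', '113-6', '113-4']
```

Because the ramen modules sit above the pasta modules, the device asks about ramen first and only falls through to the pasta path once "Want to eat ramen" has been set to "N".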

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if a pasta restaurant introduction is OK") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, "If you want to eat pasta, may I introduce a pasta restaurant?" is selected as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you want to eat pasta, may I introduce a pasta restaurant?" is output from the speaker 40 (step S307).

Suppose that the user says "Sure" in response to the speech output from the speaker 40 (step S308). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "Sure". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

The state update unit 125 updates the variable associated with the state "pasta restaurant introduction OK" in the state information 112 from "U" to "Y", as shown in FIG. 28(E). Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 28(E), the states "hungry", "want to eat pasta", and "pasta restaurant introduction OK" are "Y", the state "want to eat ramen" is "N", and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority.

For example, the topic determination unit 126 refers to the combination of variables associated with the candidate states in the state information 112 shown in FIG. 28(E) (the candidate states "hungry", "want to eat pasta", and "pasta restaurant introduction OK" are "Y", "want to eat ramen" is "N", and all other states are "U"). Among the topic modules 113-n whose activation condition is satisfied by that combination, it selects the topic module 113-2, which has the highest priority. The topic determination unit 126 determines the topic of the selected topic module 113-2 (for example, "recommend a pasta restaurant") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "recommend a pasta restaurant") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, "If a pasta restaurant introduction is OK, I recommend a pasta restaurant called XX" is selected as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If a pasta restaurant introduction is OK, I recommend a pasta restaurant called XX" is output from the speaker 40 (step S309).

Suppose that the user says "I understand" in response to the speech output from the speaker 40 (step S310). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I understand". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 28(F), the state update unit 125 updates the variable associated with the state "heard pasta restaurant recommendation" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 28(F), the states "hungry", "want to eat pasta", "pasta restaurant introduction OK", and "heard pasta restaurant recommendation" are "Y", the state "want to eat ramen" is "N", and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority.

The description above covers the processing when the user gives only affirmative answers to the output from the dialogue device 10. In ordinary dialogue, negative answers are also to be expected. In the dialogue device 10 of the present invention, the combination of topic modules that make up the topic module set allows a natural change of topic even when a negative answer is given. This is described in detail below, using the topic module set 113 shown in FIG. 29. FIG. 29 is a diagram showing an example (part 4) of the topic module set 113 in the embodiment. As shown in FIG. 29, the topic module set 113 is configured, for example, with the topic modules 113-11 to 113-18 arranged in a hierarchical structure according to a predetermined priority order.

In the example shown in FIG. 29, all of the topic modules 113-11 to 113-18 are arranged in order of closeness to the goal, where the distance is the number of topics passed through before the goal is reached. For example, suppose that the goals to be achieved in the dialogue with the user are "recommend a ramen shop" and "recommend a massage shop". Suppose also that the topics leading to the goal "recommend a ramen shop" are "Are you hungry?" ⇒ "Do you want to eat ramen?" ⇒ "Is a ramen shop introduction OK?" ⇒ "recommend a ramen shop", and that the topics leading to the goal "recommend a massage shop" are "Are you tired?" ⇒ "Do you want to know how to relax?" ⇒ "Is a massage shop introduction OK?" ⇒ "recommend a massage shop".

In this case, the topics "Are you tired?" and "Are you hungry?" are the farthest from the goal (distance = 3), the topics "Do you want to eat ramen?" and "Do you want to know how to relax?" are the next farthest (distance = 2), and the topics "Is a ramen shop introduction OK?" and "Is a massage shop introduction OK?" are the closest (distance = 1). Arranging the topics in a predetermined priority order (for example, closest to the goal first, with ramen given priority over massage) and setting the corresponding activation conditions yields the configuration shown in FIG. 29.
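The ordering just described can be illustrated with a short Python sketch. It is a hedged example rather than the procedure of the embodiment: the function name `order_by_goal_distance` and the English topic labels are assumptions, and "ramen before massage" is modeled simply by listing the ramen scenario first. The goal topic itself is given distance 0, so the distances 3, 2, and 1 match the values stated in the text.

```python
from typing import List, Tuple

# Two one-way scenarios, each ending in its goal topic.
# Listing the ramen scenario first encodes "ramen has priority over massage".
scenarios: List[List[str]] = [
    ["ask if hungry", "ask if the user wants ramen",
     "ask if a ramen shop introduction is OK", "recommend a ramen shop"],
    ["ask if tired", "ask if the user wants to know how to relax",
     "ask if a massage shop introduction is OK", "recommend a massage shop"],
]


def order_by_goal_distance(scenarios: List[List[str]]) -> List[Tuple[str, int]]:
    """Return (topic, distance) pairs sorted by distance to the goal, then scenario order."""
    ranked = []
    for scenario_rank, scenario in enumerate(scenarios):
        goal_index = len(scenario) - 1
        for index, topic in enumerate(scenario):
            distance = goal_index - index  # topics still to pass through before the goal
            ranked.append((distance, scenario_rank, topic))
    ranked.sort()  # closest to the goal first; ties broken by scenario order
    return [(topic, distance) for distance, _, topic in ranked]


ordered = order_by_goal_distance(scenarios)
for priority, (topic, distance) in enumerate(ordered, start=1):
    print(priority, distance, topic)
```

The eight resulting positions line up with the eight topic modules 113-11 to 113-18 in the walkthrough, in which "ask if hungry" and "ask if tired" occupy the two lowest-priority slots.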

[Processing of the dialogue system 100 (part 4)]
FIG. 30 is a sequence diagram (part 4) showing the flow of processing of the dialogue system 100 in the embodiment. In the description of the processing in FIG. 30, the topic modules 113-n are assumed to be arranged in the hierarchical structure shown in FIG. 29. In FIG. 30, the character string on an arrow from the user to the dialogue device 10 is the speech output by the dialogue device 10, and the character string on an arrow from the dialogue device 10 to the user is the content of the user's utterance as analyzed by the dialogue device 10. Furthermore, the variables of the states in the state information 112 at the start of the processing in FIG. 30 are assumed to hold their initial value (for example, "U").

At the start of processing, when the detection unit 122 detects the user's behavior and the state update unit 125 updates the state information 112, the topic determination unit 126 of the dialogue device 10 is triggered to determine a topic based on the state information 112 and the topic module set 113. As shown in FIG. 31(A), the variables of all states in the state information 112 at the start of the processing in FIG. 30 are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority. For example, given the combination of variables in the state information 112 shown in FIG. 31(A) (every candidate state is "U"), the topic determination unit 126 selects the topic module 113-17, which has the highest priority among the topic modules 113-n whose activation condition is satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-17 (for example, "ask if hungry") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if hungry") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, "Aren't you hungry?" is selected as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "Aren't you hungry?" is output from the speaker 40 (step S401).

Suppose that the user says "I'm not hungry" in response to the speech output from the speaker 40 (step S402). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I'm not hungry". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. In this example, the information on the topic output from the topic determination unit 126 is "ask if hungry", and the utterance content analyzed by the analysis unit 124 is "I'm not hungry". The state update unit 125 therefore selects "hungry" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states in the state information 112, the state update unit 125 may simply select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 31(B), the state update unit 125 updates the variable associated with the state "hungry" in the state information 112 from "U" to "N". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 31(B), the state "hungry" is "N" and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority.
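The candidate-state selection and variable update described above can be sketched as follows. This is a minimal illustration under assumptions: the table `TOPIC_TO_STATE` stands in for the case where topics and candidate states are associated in advance, the function name `update_state` is hypothetical, and the flag `affirmative` is assumed to be supplied by the natural language analysis rather than computed here.

```python
from typing import Dict

# Hypothetical pre-defined association between topics and candidate states.
TOPIC_TO_STATE: Dict[str, str] = {
    "ask if hungry": "hungry",
    "ask if tired": "tired",
}


def update_state(state: Dict[str, str], topic: str, affirmative: bool) -> Dict[str, str]:
    """Return a copy of `state` with the candidate state for `topic` set to "Y" or "N"."""
    candidate = TOPIC_TO_STATE[topic]        # candidate state tied to the output topic
    updated = dict(state)                    # leave the caller's state untouched
    updated[candidate] = "Y" if affirmative else "N"
    return updated


initial = {"hungry": "U", "tired": "U"}      # FIG. 31(A): every state is unknown
after_answer = update_state(initial, "ask if hungry", affirmative=False)
print(after_answer)
```

Returning a copy rather than mutating in place is a design choice of this sketch only; it makes it easy to compare the state before and after an answer, as the figures 31(A) and 31(B) do.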

For example, given the combination of variables associated with the candidate states in the state information 112 shown in FIG. 31(B) (the candidate state "hungry" is "N" and all other states are "U"), the topic determination unit 126 selects the topic module 113-18, which has the highest priority among the topic modules 113-n whose activation condition is satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-18 (for example, "ask if tired") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if tired") and the output language information 114. Specifically, the language generation unit 127 refers to the output language information 114 shown in FIG. 4 and selects the output speech character string corresponding to the determined topic. In the example shown in FIG. 4, "Aren't you tired?" is selected as the output speech character string. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "Aren't you tired?" is output from the speaker 40 (step S403).

Suppose that the user says "I'm tired" in response to the speech output from the speaker 40 (step S404). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I'm tired". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

As shown in FIG. 31(C), the state update unit 125 updates the variable associated with the state "tired" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 31(C), the state "hungry" is "N", the state "tired" is "Y", and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority.

For example, given the combination of variables associated with the candidate states in the state information 112 shown in FIG. 31(C) (the candidate state "hungry" is "N", "tired" is "Y", and all other states are "U"), the topic determination unit 126 selects the topic module 113-16, which has the highest priority among the topic modules 113-n whose activation condition is satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-16 (for example, "ask if the user wants to know how to relax") as the topic to be output, and outputs information on the determined topic to the state update unit 125.

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if the user wants to know how to relax") and the output language information 114. The output language information 114 shown in FIG. 4 does not include the output speech character strings for scenario SC3 (they were omitted for convenience of explanation), but FIG. 19 shows them. When content corresponding to scenario SC3 is included in the topic module set, as in FIG. 29, the topic module set creation unit 120 generates the output language information 114 so that it includes the output speech character strings for scenario SC3 shown in FIG. 19. The output speech character strings for scenario SC3 are therefore described here with reference to FIG. 19.

Specifically, the language generation unit 127 refers to the output speech character strings shown in FIG. 19 and selects the one corresponding to the topic determined by the topic determination unit 126 (for example, "ask if the user wants to know how to relax"). In the example shown in FIG. 19, "If you're tired, don't you want to know how to relax?" is selected as the output speech character string corresponding to the determined topic. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you're tired, don't you want to know how to relax?" is output from the speaker 40 (step S405).
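The lookup described above, with the scenario SC3 strings folded into the output language information, can be sketched as a simple table merge. This is an assumption-laden illustration: the dictionary names, the function `generate_text`, and the English topic keys are hypothetical, and the values are English renderings of the strings quoted in the text, not the contents of FIG. 4 or FIG. 19 in full.

```python
from typing import Dict

# Strings attributed to FIG. 4 in the text (English renderings).
fig4_strings: Dict[str, str] = {
    "ask if hungry": "Aren't you hungry?",
    "ask if a pasta restaurant introduction is OK":
        "If you want to eat pasta, may I introduce a pasta restaurant?",
}

# Strings for scenario SC3, attributed to FIG. 19 in the text.
fig19_strings: Dict[str, str] = {
    "ask if the user wants to know how to relax":
        "If you're tired, don't you want to know how to relax?",
    "ask if a massage shop introduction is OK":
        "If you want to know how to relax, may I introduce a massage shop?",
}

# Output language information covering both sets of topics.
output_language_info: Dict[str, str] = {**fig4_strings, **fig19_strings}


def generate_text(topic: str) -> str:
    """Return the output speech character string registered for `topic`."""
    return output_language_info[topic]


print(generate_text("ask if the user wants to know how to relax"))
```

A plain dictionary lookup raises `KeyError` for a topic with no registered string; whether the embodiment handles that case, and how, is not stated in this passage.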

Suppose that the user says "I want to know" in response to the speech output from the speaker 40 (step S406). The speech uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing, and the analysis unit 124 analyzes that character string through natural language processing. The analysis determines that the user said "I want to know". The state update unit 125 updates the state information 112 based on the utterance content analyzed by the analysis unit 124.

Specifically, the state update unit 125 selects the applicable candidate state from among the candidate states in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. In this example, the information on the topic output from the topic determination unit 126 is "ask if the user wants to know how to relax", and the utterance content analyzed by the analysis unit 124 is "I want to know". The state update unit 125 therefore selects "want to know how to relax" as the applicable candidate state. If the topics of the topic modules 113-n are associated in advance with the candidate states in the state information 112, the state update unit 125 may simply select the candidate state associated with the information on the topic output from the topic determination unit 126.

As shown in FIG. 31(D), the state update unit 125 updates the variable associated with the state "want to know how to relax" in the state information 112 from "U" to "Y". Once the state update unit 125 has updated the state information 112, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. At this point, as shown in FIG. 31(D), the state "hungry" is "N", the states "tired" and "want to know how to relax" are "Y", and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and, from among the topic modules 113-n whose condition is satisfied by the combination of variables associated with the candidate states in the state information 112, selects the topic module 113-n with the highest priority.

For example, given the combination of variables associated with the candidate states in the state information 112 shown in FIG. 31(D) (the candidate state "hungry" is "N", "tired" and "want to know how to relax" are "Y", and all other states are "U"), the topic determination unit 126 selects the topic module 113-14, which has the highest priority among the topic modules 113-n whose activation condition is satisfied by that combination. The topic determination unit 126 determines the topic of the selected topic module 113-14 (for example, "ask if a massage shop introduction is OK") as the topic to be output, and outputs information on the determined topic to the state update unit 125.
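The topic change after a negative answer can be traced end to end with a small Python simulation. It is a sketch under explicit assumptions, not the module set of FIG. 29 itself: the text names only the modules 113-14, 113-16, 113-17, and 113-18, so the topics assigned to 113-11, 113-12, 113-13, and 113-15 are inferred from the priority ordering, and every activation condition below is hypothetical. The names `MODULES`, `STATE_FOR_TOPIC`, `select`, and `run` are likewise illustrative.

```python
from typing import Dict, List, Optional, Tuple

# (module id, topic, activation condition), highest priority first.
# Conditions are assumptions chosen to reproduce the walkthrough.
MODULES: List[Tuple[str, str, Dict[str, str]]] = [
    ("113-11", "recommend a ramen shop",
     {"ramen shop introduction OK": "Y", "heard ramen shop recommendation": "U"}),
    ("113-12", "recommend a massage shop",
     {"massage shop introduction OK": "Y", "heard massage shop recommendation": "U"}),
    ("113-13", "ask if a ramen shop introduction is OK",
     {"want to eat ramen": "Y", "ramen shop introduction OK": "U"}),
    ("113-14", "ask if a massage shop introduction is OK",
     {"want to know how to relax": "Y", "massage shop introduction OK": "U"}),
    ("113-15", "ask if the user wants ramen", {"hungry": "Y", "want to eat ramen": "U"}),
    ("113-16", "ask if the user wants to know how to relax",
     {"tired": "Y", "want to know how to relax": "U"}),
    ("113-17", "ask if hungry", {"hungry": "U"}),
    ("113-18", "ask if tired", {"tired": "U"}),
]

# Candidate state updated by the answer to each topic (assumed association).
STATE_FOR_TOPIC: Dict[str, str] = {
    "recommend a ramen shop": "heard ramen shop recommendation",
    "recommend a massage shop": "heard massage shop recommendation",
    "ask if a ramen shop introduction is OK": "ramen shop introduction OK",
    "ask if a massage shop introduction is OK": "massage shop introduction OK",
    "ask if the user wants ramen": "want to eat ramen",
    "ask if the user wants to know how to relax": "want to know how to relax",
    "ask if hungry": "hungry",
    "ask if tired": "tired",
}


def select(state: Dict[str, str]) -> Optional[Tuple[str, str]]:
    """Return (module id, topic) of the highest-priority module whose condition holds."""
    for module_id, topic, condition in MODULES:
        if all(state.get(name, "U") == value for name, value in condition.items()):
            return module_id, topic
    return None


def run(answers: List[str]) -> Tuple[List[str], Dict[str, str]]:
    """Feed "Y"/"N" answers in turn and record which module is selected each time."""
    state: Dict[str, str] = {}  # missing states count as "U"
    trace: List[str] = []
    for answer in answers:
        chosen = select(state)
        if chosen is None:
            break
        module_id, topic = chosen
        trace.append(module_id)
        state[STATE_FOR_TOPIC[topic]] = answer
    return trace, state


# Answers in steps S402, S404, S406, S408: not hungry, tired, wants to know, OK.
trace, final_state = run(["N", "Y", "Y", "Y"])
print(trace)
```

Under these assumed conditions, the "N" answer to "ask if hungry" disables both 113-17 and 113-15, so the selection falls through to 113-18 and the dialogue moves to the massage scenario without any explicit transition rule.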

The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (for example, "ask if a massage shop introduction is OK") and the output language information 114. Specifically, the language generation unit 127 refers to the output speech character strings shown in FIG. 19 and selects the one corresponding to the determined topic. In the example shown in FIG. 19, "If you want to know how to relax, may I introduce a massage shop?" is selected as the output speech character string corresponding to the determined topic. The speech synthesis unit 128 generates a speech signal corresponding to the selected output speech character string and outputs the generated speech signal via the speaker 40. As a result, the speech "If you want to know how to relax, may I introduce a massage shop?" is output from the speaker 40 (step S407).

ユーザは、スピーカー４０から出力された音声に応じて、“いいよ”と発話したとする（ステップＳ４０８）。ユーザにより発話された音声はマイク３０を介して対話装置１０に入力される。音声認識部１２３は、音声認識処理によりユーザが発話した音声に対応する文字列を生成する。音声認識部１２３により生成された文字列は、解析部１２４による自然言語処理で解析される。これにより、ユーザが“いいよ”と発話したことが解析される。状態更新部１２５は、解析部１２４により解析された発話内容に基づいて、状態情報１１２を更新する。 Suppose the user says "Yes" in response to the voice output from the speaker 40 (step S408). The voice spoken by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the user's speech through voice recognition processing. The character string generated by the voice recognition unit 123 is analyzed by the analysis unit 124 through natural language processing. As a result, the analysis determines that the user said "Yes". The state update unit 125 updates the state information 112 based on the content of the utterance analyzed by the analysis unit 124.

具体的には、状態更新部１２５は、話題決定部１２６から出力された話題に関する情報と、解析部１２４により解析された発話内容とを踏まえて、状態情報１１２で示される候補状態の中から該当する候補状態を選択する。例えば、話題決定部１２６から出力された話題に関する情報が“マッサージ店紹介ＯＫか聞く”であり、解析部１２４により解析された発話内容が“いいよ”である。そこで、状態更新部１２５は、該当する候補状態として“マッサージ店紹介ＯＫ”を選択する。なお、話題モジュール１１３－ｎにおける話題と、状態情報１１２で示される各候補状態とが予め対応付けられている場合には、状態更新部１２５は、話題決定部１２６から出力された話題に関する情報に対応付けられている候補状態を選択すればよい。 Specifically, the state update unit 125 selects a corresponding candidate state from the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the speech content analyzed by the analysis unit 124. For example, the information on the topic output from the topic determination unit 126 is "Ask if it's OK to introduce a massage shop," and the speech content analyzed by the analysis unit 124 is "Yes." Therefore, the state update unit 125 selects "Massage shop introduction OK" as the corresponding candidate state. Note that, if the topic in the topic module 113-n and each candidate state indicated in the state information 112 are associated in advance, the state update unit 125 may select the candidate state associated with the information on the topic output from the topic determination unit 126.
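The state update described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the table `TOPIC_TO_STATE`, the phrase sets, and the function `update_state` are hypothetical names, and the exact-phrase matching is only a toy stand-in for the natural language processing of the analysis unit 124.

```python
# Hypothetical association between each topic and the candidate state it asks about.
TOPIC_TO_STATE = {
    "ask if it's OK to introduce a massage shop": "massage shop introduction OK",
    "recommend a massage shop": "heard massage shop recommendation",
}

# Toy stand-in for the analysis unit 124: classify a reply by exact phrase.
AFFIRMATIVE = {"yes", "sure", "ok", "i understand"}
NEGATIVE = {"no", "no thanks"}


def update_state(state, topic, utterance):
    """Return a copy of state with the variable for the asked topic updated."""
    new_state = dict(state)
    name = TOPIC_TO_STATE.get(topic)
    reply = utterance.strip().lower()
    if name is not None:
        if reply in AFFIRMATIVE:
            new_state[name] = "Y"
        elif reply in NEGATIVE:
            new_state[name] = "N"
    return new_state


# Variables that are absent from the dict are treated as "U" (unknown).
before = {"hungry": "N", "tired": "Y", "want to know how to relax": "Y"}
after = update_state(before, "ask if it's OK to introduce a massage shop", "Yes")
print(after["massage shop introduction OK"])
```

Returning a copy rather than mutating the input keeps the previous state available, which is convenient when comparing the state before and after a turn.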

状態更新部１２５は、図３１（Ｅ）に示すように、状態情報１１２で示される状態“マッサージ店紹介ＯＫ”に対応付けられている変数“Ｕ”を“Ｙ”に更新する。話題決定部１２６は、状態更新部１２５により状態情報１１２が更新されると、更新後の状態情報１１２と、話題モジュールセット１１３とに基づいて次に話すべき話題を決定する。この時点の状態情報１１２で示される各状態の変数は、図３１（Ｅ）に示す通り、状態情報１１２で示される状態“お腹空いている”が“Ｎ”であり、“疲れている”、“癒す方法知りたい”及び“マッサージ店紹介ＯＫ”が“Ｙ”であり、それ以外の状態は“Ｕ”である。話題決定部１２６は、話題モジュールセット１１３を参照し、状態情報１１２における各候補状態に対応付けられている変数の組み合わせで満たされる条件を含む話題モジュール１１３－ｎであって、かつ、優先順位の最も高い話題モジュール１１３－ｎを選択する。 As shown in FIG. 31 (E), the state update unit 125 updates the variable associated with the state "Massage shop introduction OK" in the state information 112 from "U" to "Y". When the state information 112 is updated by the state update unit 125, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. As shown in FIG. 31 (E), the variables of the states in the state information 112 at this point are as follows: the state "Hungry" is "N"; "Tired", "Want to know how to relax", and "Massage shop introduction OK" are "Y"; and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the topic module 113-n with the highest priority.

例えば、話題決定部１２６は、図３１（Ｅ）に示す状態情報１１２における各候補状態に対応付けられている変数（例えば、候補状態“お腹空いている”が“Ｎ”であり、“疲れている”、“癒す方法知りたい”及び“マッサージ店紹介ＯＫ”が“Ｙ”であり、それ以外の状態は“Ｕ”）の組み合わせであって、起動条件として定義されている変数の組み合わせを満たす条件を含む話題モジュール１１３－ｎのうち、優先順位の最も高い話題モジュール１１３－１２を選択する。話題決定部１２６は、選択した話題モジュール１１３－１２における話題（例えば、“マッサージ店を推薦する”）を、出力対象の話題として決定する。話題決定部１２６は、決定した話題に関する情報を状態更新部１２５に出力する。 For example, the topic determination unit 126 selects the topic module 113-12 with the highest priority from among the topic modules 113-n that contain a condition that satisfies the combination of variables defined as the activation condition, which is a combination of variables associated with each candidate state in the state information 112 shown in FIG. 31 (E) (for example, the candidate state "hungry" is "N", "tired", "want to know how to relax", and "massage shop introduction OK" are "Y", and other states are "U"). The topic determination unit 126 determines the topic in the selected topic module 113-12 (for example, "recommend a massage shop") as the topic to be output. The topic determination unit 126 outputs information related to the determined topic to the state update unit 125.
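The selection rule used by the topic determination unit 126 can be sketched as below. The activation conditions and the priority order in `TOPIC_MODULES` are assumptions made for illustration (the actual conditions are defined in the figures, which are not reproduced here), and `select_topic` is a hypothetical name.

```python
# Hypothetical topic modules, listed from highest to lowest priority.
# Each activation condition maps a candidate state to the value it must have.
TOPIC_MODULES = [
    {
        "topic": "recommend a massage shop",
        "condition": {
            "massage shop introduction OK": "Y",
            "heard massage shop recommendation": "U",
        },
    },
    {
        "topic": "ask if it's OK to introduce a massage shop",
        "condition": {
            "want to know how to relax": "Y",
            "massage shop introduction OK": "U",
        },
    },
]


def select_topic(state, modules=TOPIC_MODULES):
    """Return the topic of the highest-priority module whose condition holds."""
    for module in modules:  # modules are already sorted by priority
        condition = module["condition"]
        # A variable missing from the state is treated as "U" (unknown).
        if all(state.get(name, "U") == value for name, value in condition.items()):
            return module["topic"]
    return None  # no selectable topic


state_d = {"hungry": "N", "tired": "Y", "want to know how to relax": "Y"}
state_e = {**state_d, "massage shop introduction OK": "Y"}
print(select_topic(state_d))
print(select_topic(state_e))
```

Because the list is ordered by priority, the first match is the answer and no explicit priority comparison is needed.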

言語生成部１２７は、話題決定部１２６により決定された話題（例えば、“マッサージ店を推薦する”）と、出力言語情報１１４とに基づいて音声出力させる文字列を生成する。具体的には、言語生成部１２７は、図１９に示す出力音声文字列を参照し、話題決定部１２６により決定された話題（例えば、“マッサージ店を推薦する”）に対応する出力音声文字列を選択する。図１９に示す例では、話題決定部１２６により決定された話題（例えば、“マッサージ店を推薦する”）に対応する出力音声文字列として“マッサージ店紹介ＯＫなら、〇〇っていうマッサージ店がおすすめ”を選択する。音声合成部１２８は、言語生成部１２７により選択された出力音声文字列“マッサージ店紹介ＯＫなら、〇〇っていうマッサージ店がおすすめ”に対応する音声信号を生成し、生成した音声信号を、スピーカー４０を介して出力する。これにより、“マッサージ店紹介ＯＫなら、〇〇っていうマッサージ店がおすすめ”という音声がスピーカー４０から出力される（ステップＳ４０９）。 The language generation unit 127 generates a character string to be output as speech based on the topic determined by the topic determination unit 126 (e.g., "recommend a massage parlor") and the output language information 114. Specifically, the language generation unit 127 refers to the output speech character string shown in FIG. 19 and selects an output speech character string corresponding to the topic determined by the topic determination unit 126 (e.g., "recommend a massage parlor"). In the example shown in FIG. 19, "If you are OK with a massage parlor introduction, I recommend a massage parlor called XX" is selected as the output speech character string corresponding to the topic determined by the topic determination unit 126 (e.g., "recommend a massage parlor"). The speech synthesis unit 128 generates a speech signal corresponding to the output speech character string "If you are OK with a massage parlor introduction, I recommend a massage parlor called XX" selected by the language generation unit 127, and outputs the generated speech signal via the speaker 40. As a result, a speech "If you are OK with a massage parlor introduction, I recommend a massage parlor called XX" is output from the speaker 40 (step S409).

ユーザは、スピーカー４０から出力された音声に応じて、“分かった”と発話したとする（ステップＳ４１０）。ユーザにより発話された音声はマイク３０を介して対話装置１０に入力される。音声認識部１２３は、音声認識処理によりユーザが発話した音声に対応する文字列を生成する。音声認識部１２３により生成された文字列は、解析部１２４による自然言語処理で解析される。これにより、ユーザが“分かった”と発話したことが解析される。状態更新部１２５は、解析部１２４により解析された発話内容に基づいて、状態情報１１２を更新する。 It is assumed that the user utters "I understand" in response to the voice output from the speaker 40 (step S410). The voice uttered by the user is input to the dialogue device 10 via the microphone 30. The voice recognition unit 123 generates a character string corresponding to the voice uttered by the user through voice recognition processing. The character string generated by the voice recognition unit 123 is analyzed by the analysis unit 124 through natural language processing. As a result, it is analyzed that the user has uttered "I understand". The state update unit 125 updates the state information 112 based on the content of the utterance analyzed by the analysis unit 124.

具体的には、状態更新部１２５は、話題決定部１２６から出力された話題に関する情報と、解析部１２４により解析された発話内容とを踏まえて、状態情報１１２で示される候補状態の中から該当する候補状態を選択する。例えば、話題決定部１２６から出力された話題に関する情報が“マッサージ店を推薦する”であり、解析部１２４により解析された発話内容が“分かった”である。そこで、状態更新部１２５は、該当する候補状態として“マッサージ店推薦聞いた”を選択する。なお、話題モジュール１１３－ｎにおける話題と、状態情報１１２で示される各候補状態とが予め対応付けられている場合には、状態更新部１２５は、話題決定部１２６から出力された話題に関する情報に対応付けられている候補状態を選択すればよい。 Specifically, the state update unit 125 selects the corresponding candidate state from the candidate states indicated in the state information 112, based on the information on the topic output from the topic determination unit 126 and the utterance content analyzed by the analysis unit 124. For example, the information on the topic output from the topic determination unit 126 is "recommend a massage shop", and the utterance content analyzed by the analysis unit 124 is "I understand". The state update unit 125 therefore selects "Heard massage shop recommendation" as the corresponding candidate state. Note that, if the topic in each topic module 113-n is associated in advance with a candidate state indicated in the state information 112, the state update unit 125 may simply select the candidate state associated with the information on the topic output from the topic determination unit 126.

状態更新部１２５は、図３１（Ｆ）に示すように、状態情報１１２で示される状態“マッサージ店推薦聞いた”に対応付けられている変数“Ｕ”を“Ｙ”に更新する。話題決定部１２６は、状態更新部１２５により状態情報１１２が更新されると、更新後の状態情報１１２と、話題モジュールセット１１３とに基づいて次に話すべき話題を決定する。この時点の状態情報１１２で示される各状態の変数は、図３１（Ｆ）に示す通り、状態情報１１２で示される状態“お腹空いている”が“Ｎ”であり、“疲れている”、“癒す方法知りたい”、“マッサージ店紹介ＯＫ”及び“マッサージ店推薦聞いた”が“Ｙ”であり、それ以外の状態は“Ｕ”である。話題決定部１２６は、話題モジュールセット１１３を参照し、状態情報１１２における各候補状態に対応付けられている変数の組み合わせで満たされる条件を含む話題モジュール１１３－ｎであって、かつ、優先順位の最も高い話題モジュール１１３－ｎを選択する。 As shown in FIG. 31 (F), the state update unit 125 updates the variable associated with the state "Heard massage shop recommendation" in the state information 112 from "U" to "Y". When the state information 112 is updated by the state update unit 125, the topic determination unit 126 determines the next topic to talk about based on the updated state information 112 and the topic module set 113. As shown in FIG. 31 (F), the variables of the states in the state information 112 at this point are as follows: the state "Hungry" is "N"; "Tired", "Want to know how to relax", "Massage shop introduction OK", and "Heard massage shop recommendation" are "Y"; and all other states are "U". The topic determination unit 126 refers to the topic module set 113 and selects, from among the topic modules 113-n whose conditions are satisfied by the combination of variables associated with the candidate states in the state information 112, the topic module 113-n with the highest priority.

ところが、現時点においては選択可能な話題がない。この場合、対話装置１０はユーザとの対話を終了する。なお、状態更新部１２５は、他の装置又は機能部からの指示で一部または全ての変数の初期化を行ってもよい。このように、図３０に示す処理では、ユーザから否定的な回答が得られた場合においても、自然に他の話題に転換して対話を続けることができる。 However, at the current time, there is no topic that can be selected. In this case, the dialogue device 10 ends the dialogue with the user. The state update unit 125 may initialize some or all of the variables in response to an instruction from another device or functional unit. In this way, in the process shown in FIG. 30, even if a negative response is obtained from the user, the dialogue can be continued by naturally switching to another topic.
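The overall turn-taking, including the termination when no topic is selectable, can be sketched as a simple loop. This is an illustrative sketch under assumptions: the module definitions, the `records` field that names the state variable updated by each answer, and the function names are hypothetical, and user answers are supplied as already-analyzed "Y"/"N" values.

```python
def satisfied(condition, state):
    """True if every variable in the condition has the required value."""
    return all(state.get(name, "U") == value for name, value in condition.items())


def run_dialogue(modules, state, answers):
    """Speak topics until no module is selectable or the answers run out."""
    state = dict(state)
    answers = iter(answers)
    spoken = []
    while True:
        # modules are ordered from highest to lowest priority
        module = next((m for m in modules if satisfied(m["condition"], state)), None)
        if module is None:
            break  # no selectable topic: the dialogue ends
        spoken.append(module["topic"])
        answer = next(answers, None)
        if answer is None:
            break  # the user stopped responding
        state[module["records"]] = answer
    return spoken, state


# Hypothetical modules for the massage shop sequence.
MODULES = [
    {"topic": "recommend a massage shop",
     "condition": {"massage shop introduction OK": "Y",
                   "heard massage shop recommendation": "U"},
     "records": "heard massage shop recommendation"},
    {"topic": "ask if it's OK to introduce a massage shop",
     "condition": {"want to know how to relax": "Y",
                   "massage shop introduction OK": "U"},
     "records": "massage shop introduction OK"},
]

start = {"hungry": "N", "tired": "Y", "want to know how to relax": "Y"}
spoken, final_state = run_dialogue(MODULES, start, ["Y", "Y"])
print(spoken)
```

Resetting some or all variables to "U", as the text allows on instruction from another device or functional unit, would make earlier topics selectable again.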

（ユーザと対話装置１０とがテキストによる対話を行う構成） (Configuration in which the user and the dialogue device 10 have a text-based dialogue)
以上が、対話装置１０が、ユーザの発話内容に応じて話題を決定し、決定された話題による内容を音声出力させる構成の一実施形態の説明である。次に、ユーザと対話装置１０とがテキストによる対話を行う構成について説明する。このように構成される場合、対話システム１００は、マイク３０及びスピーカー４０を備えなくてよい。さらに、対話装置１０は、音声認識部１２３及び音声合成部１２８を備えなくてよい。テキストにより話題に関する内容を出力する手段として、チャットボット等のテキスト出力手段が用いられる。 The above is an explanation of one embodiment of a configuration in which the dialogue device 10 determines a topic according to the content of a user's utterance and outputs the content of the determined topic by voice. Next, a configuration in which the user and the dialogue device 10 have a dialogue through text will be explained. When configured in this way, the dialogue system 100 does not need to include the microphone 30 and the speaker 40. Furthermore, the dialogue device 10 does not need to include the voice recognition unit 123 and the voice synthesis unit 128. A text output means such as a chatbot is used as a means for outputting the content related to the topic by text.

ユーザと対話装置１０とがテキストによる対話を行う場合には、ユーザは、自身が保持するスマートフォン等の通信装置、又は、対話装置１０に接続されるキーボード等の入力装置を介して、対話内容に関する文字列を対話装置１０に入力する。対話装置１０の解析部１２４は、入力された文字列と、記憶部１１に記憶されている辞書１１１とを用いて自然言語処理を行うことでユーザが入力した内容を解析する。解析部１２４による内容の解析から言語生成部１２７による文字列を生成までの処理は、音声及び発話をテキストに置き換えれば処理は上述した処理と同じである。その後、対話装置１０は、不図示の表示制御部により、言語生成部１２７により生成された文字列を表示装置５０に表示させる。表示制御部は、制御部１２で実現される機能である。 When a user and the dialogue device 10 have a text dialogue, the user inputs a character string related to the dialogue content to the dialogue device 10 via a communication device such as a smartphone owned by the user or an input device such as a keyboard connected to the dialogue device 10. The analysis unit 124 of the dialogue device 10 analyzes the content input by the user by performing natural language processing using the input character string and the dictionary 111 stored in the storage unit 11. The process from the analysis of the content by the analysis unit 124 to the generation of the character string by the language generation unit 127 is the same as the process described above, except that voice and speech are replaced with text. After that, the dialogue device 10 causes the display control unit (not shown) to display the character string generated by the language generation unit 127 on the display device 50. The display control unit is a function realized by the control unit 12.

このように構成される場合、聴覚に障害があるユーザにおいても、対話システム１００を利用することが可能になる。このように、対話システム１００の利便性を向上させることが可能になる。 When configured in this way, even users with hearing impairments can use the dialogue system 100. In this way, it is possible to improve the convenience of the dialogue system 100.

（ユーザが音声による対話を行い、対話装置１０がテキストによる対話を行う構成） (Configuration in which the user conducts the dialogue by voice and the dialogue device 10 conducts the dialogue by text)
ユーザが音声による対話を行い、対話装置１０がテキストによる対話を行う構成について説明する。このように構成される場合、対話システム１００は、スピーカー４０を備えなくてよい。さらに、対話装置１０は、音声合成部１２８を備えなくてよい。ユーザが音声による対話を行い、対話装置１０がテキストによる対話を行う場合には、マイク３０による音声入力から言語生成部１２７による文字列を生成までの処理は、上述した実施形態の処理と同じである。その後、対話装置１０は、不図示の表示制御部により、言語生成部１２７により生成された文字列を表示装置５０に表示させる。表示制御部は、制御部１２で実現される機能である。 A configuration will be described in which the user conducts the dialogue by voice and the dialogue device 10 conducts the dialogue by text. When configured in this manner, the dialogue system 100 does not need to include the speaker 40. Furthermore, the dialogue device 10 does not need to include the voice synthesis unit 128. In this case, the processing from voice input by the microphone 30 to generation of a character string by the language generation unit 127 is the same as the processing in the above-mentioned embodiment. Thereafter, the dialogue device 10 causes a display control unit (not shown) to display the character string generated by the language generation unit 127 on the display device 50. The display control unit is a function realized by the control unit 12.

（ユーザがテキストによる対話を行い、対話装置１０が音声出力による対話を行う構成） (Configuration in which the user conducts the dialogue by text and the dialogue device 10 conducts the dialogue by voice output)
ユーザがテキストによる対話を行い、対話装置１０が音声出力による対話を行う構成について説明する。このように構成される場合、対話システム１００は、マイク３０を備えなくてよい。さらに、対話装置１０は、音声認識部１２３を備えなくてよい。ユーザがテキストによる対話を行い、対話装置１０が音声出力による対話を行う場合には、ユーザは、自身が保持するスマートフォン等の通信装置、又は、対話装置１０に接続されるキーボード等の入力装置を介して、対話内容に関する文字列を対話装置１０に入力する。対話装置１０の解析部１２４は、入力された文字列と、記憶部１１に記憶されている辞書１１１とを用いて自然言語処理を行うことでユーザが入力した内容を解析する。以降の処理は、上述した実施形態に記載の処理と同じである。 A configuration will be described in which the user conducts the dialogue by text and the dialogue device 10 conducts the dialogue by voice output. In this configuration, the dialogue system 100 does not need to include the microphone 30. Furthermore, the dialogue device 10 does not need to include the voice recognition unit 123. In this case, the user inputs a character string related to the dialogue content to the dialogue device 10 via a communication device such as a smartphone held by the user or an input device such as a keyboard connected to the dialogue device 10. The analysis unit 124 of the dialogue device 10 analyzes the content input by the user by performing natural language processing using the input character string and the dictionary 111 stored in the storage unit 11. The subsequent processing is the same as the processing described in the above-mentioned embodiment.

このように構成される場合、発話が困難なユーザにおいても、対話システム１００を利用することが可能になる。このように、対話システム１００の利便性を向上させることが可能になる。 When configured in this manner, even users who have difficulty speaking can use the dialogue system 100. In this way, it is possible to improve the convenience of the dialogue system 100.
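The four input/output configurations differ only in which audio components they require. The following sketch summarizes that relationship; the function name `audio_components` and the mode strings "voice" and "text" are hypothetical, while the component names follow the reference numerals used in the description.

```python
def audio_components(user_input, device_output):
    """List the audio-related components needed for a given configuration."""
    for mode in (user_input, device_output):
        if mode not in ("voice", "text"):
            raise ValueError(f"unknown mode: {mode!r}")
    needed = []
    if user_input == "voice":
        # Spoken input has to be captured and converted to a character string.
        needed += ["microphone 30", "voice recognition unit 123"]
    if device_output == "voice":
        # Spoken output has to be synthesized and played back.
        needed += ["speaker 40", "voice synthesis unit 128"]
    return needed


for pair in [("voice", "voice"), ("text", "text"), ("voice", "text"), ("text", "voice")]:
    print(pair, audio_components(*pair))
```

In every configuration, the processing between the analysis unit 124 and the language generation unit 127 stays the same, which is why only the two ends of the pipeline appear here.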

以上のように構成された対話システム１００によれば、対話装置１０の話題モジュールセット作成部１２０が、一方向の対話の流れが記述された１以上のシナリオに基づいて、所定の条件が満たされた場合に対話の相手であるユーザに提供する話題が示された話題モジュールを複数含む話題モジュールセットを作成する。対話装置１０は、状態情報１１２と話題モジュールセット１１３とに基づいて、ユーザの状態に応じた話題を決定し、決定された話題による内容を出力させる。このように、対話装置１０では、状態情報１１２を参照して、複数の話題モジュール１１３－ｎの中から起動条件を満たした話題による内容を出力することができる。したがって、ユーザの状態に応じた話題による内容を出力することができる。さらに、対話装置１０は、起動条件を満たしたいずれかの話題による内容を出力することができるため、複雑な話題の遷移パターンを想定しきれていない場合であっても対話を継続させることが可能になる。 According to the dialogue system 100 configured as above, the topic module set creation unit 120 of the dialogue device 10 creates a topic module set including multiple topic modules indicating topics to be provided to the user who is the dialogue partner when a predetermined condition is satisfied, based on one or more scenarios describing a one-way dialogue flow. The dialogue device 10 determines a topic according to the user's state based on the state information 112 and the topic module set 113, and outputs content based on the determined topic. In this way, the dialogue device 10 can refer to the state information 112 and output content based on a topic that satisfies the activation condition from among multiple topic modules 113-n. Therefore, content based on a topic according to the user's state can be output. Furthermore, since the dialogue device 10 can output content based on any topic that satisfies the activation condition, it becomes possible to continue the dialogue even when complex topic transition patterns cannot be fully anticipated.

上述したように、対話装置１０は、一方向の対話の流れが記述された１以上のシナリオＳＣを用いる。これにより、設計者は、複雑な分岐を含むシナリオＳＣを作成する必要がなく、単に一方向の対話の流れが記述された１以上のシナリオＳＣを作成すればよい。そして、対話装置１０は、一方向の対話の流れが記述された１以上のシナリオＳＣを用いることで容易に話題モジュールセットを作成することができる。 As described above, the dialogue device 10 uses one or more scenarios SC, each describing a one-way dialogue flow. The designer therefore does not need to create a scenario SC that includes complex branches and only needs to create one or more scenarios SC that each describe a one-way dialogue flow. By using these scenarios SC, the dialogue device 10 can easily create a topic module set.

対話装置１０は、ユーザの発話内容又はテキストにより入力された内容に応じて状態情報１１２を更新する。これにより、対話履歴を加味した話題を決定することができる。したがって、対話装置１０は、過去の会話と関係ない話題を選択してしまう確率を低減することができる。そのため、対話を継続させることが可能になる。 The dialogue device 10 updates the state information 112 according to the content of the user's utterance or the content input by text. This allows the dialogue device 10 to determine a topic that takes into account the dialogue history. Therefore, the dialogue device 10 can reduce the probability of selecting a topic that is unrelated to past conversations. This makes it possible to continue the dialogue.

話題モジュールセットは、定められた優先順位で各話題モジュールが階層構造に配置されており、対話装置１０は、状態情報に基づいて、満たされた起動条件に対応付けられた話題のうち、優先順位の最も高い話題をユーザの状態に応じた話題として決定する。これにより、設計者の意図に沿って対話を進めることができる。 In the topic module set, the topic modules are arranged in a hierarchical structure according to a predetermined priority order, and the dialogue device 10 determines, based on the state information, the topic with the highest priority among the topics associated with satisfied activation conditions as the topic that corresponds to the user's state. This allows the dialogue to proceed in line with the designer's intentions.

＜変形例１＞ <Modification 1>
話題モジュールセット１１３を構成する話題モジュール１１３－ｎの並び順は、処理の前後又は処理の途中で変更されてもよい。例えば、対話装置１０は、処理開始時において話題モジュール１１３－ｎの並び順が図３の並び順であったとして、処理の途中又は処理の終了後に話題モジュール１１３－ｎの並び順を図２２の並び順に変更してもよい。このように構成される場合、対話装置１０は、話題モジュール１１３－ｎの並び順に関する情報を複数保持しておき、並び替え条件が満たされたタイミングで話題モジュール１１３－ｎの並び順を変更すればよい。並び替え条件は、例えば１つの処理（例えば、図２０、図２３及び図２７等の処理）が終了することであってもよいし、予め定められた時刻になったことであってもよいし、外部から変更の指示がなされたことであってもよい。 The order of the topic modules 113-n constituting the topic module set 113 may be changed before, after, or during processing. For example, assuming that the order of the topic modules 113-n is the order of FIG. 3 at the start of processing, the dialogue device 10 may change the order of the topic modules 113-n to the order of FIG. 22 during processing or after processing ends. When configured in this way, the dialogue device 10 may hold a plurality of pieces of information regarding the order of the topic modules 113-n, and change the order of the topic modules 113-n at the timing when a reordering condition is satisfied. The reordering condition may be, for example, the end of one process (for example, the process of FIG. 20, FIG. 23, FIG. 27, etc.), the arrival of a predetermined time, or an instruction to change from the outside.
このように構成されることによって、対話の進め方の自由度を広げることができる。そのため、利便性を向上させることが可能になる。 Such a configuration allows for greater freedom in how the dialogue proceeds, thereby improving convenience.
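Holding several orderings and switching between them when a reordering condition is met can be sketched as follows. The class `TopicModuleSet`, the order names, the event names, and the rule table are all hypothetical; the module identifiers are placeholders and do not reproduce the orders shown in FIG. 3 or FIG. 22.

```python
class TopicModuleSet:
    """Keeps several priority orders and tracks which one is active."""

    def __init__(self, orders, active):
        self.orders = orders  # order name -> module ids, highest priority first
        self.active = active

    def current_order(self):
        return list(self.orders[self.active])

    def on_event(self, event, reorder_rules):
        """Switch the active order if the event satisfies a reordering condition."""
        target = reorder_rules.get(event)
        if target is None or target not in self.orders:
            return False
        self.active = target
        return True


# Placeholder orders and rules for illustration only.
module_set = TopicModuleSet(
    orders={"initial": ["m1", "m2", "m3"], "alternate": ["m3", "m1", "m2"]},
    active="initial",
)
rules = {"process_finished": "alternate", "external_instruction": "initial"}

changed = module_set.on_event("process_finished", rules)
print(changed, module_set.current_order())
```

A time-based condition could be handled the same way by emitting an event when the predetermined time arrives.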

＜変形例２＞ <Modification 2>
上述した対話システム１００では、表示装置５０に二次元で表現されたエージェントを表示して、エージェントが話しかけているように見せていた。これに対して、表示装置５０に代えて、対話装置１０の近傍にロボットを設置し、ロボットが話しかけているように構成されてもよい。図３０は、変形例における対話システム１００ａの構成の一例を示す図である。対話システム１００ａは、対話装置１０ａと、カメラ２０と、マイク３０と、スピーカー４０と、ロボット６０とを備える。カメラ２０と、マイク３０と、スピーカー４０と、ロボット６０とは、有線又は無線により対話装置１０ａに接続される。 In the dialogue system 100 described above, a two-dimensional agent is displayed on the display device 50 so that the agent appears to be speaking. Alternatively, instead of the display device 50, a robot may be installed near the dialogue device 10 so that the robot appears to be speaking. FIG. 30 is a diagram showing an example of the configuration of a dialogue system 100a in a modified example. The dialogue system 100a includes a dialogue device 10a, a camera 20, a microphone 30, a speaker 40, and a robot 60. The camera 20, the microphone 30, the speaker 40, and the robot 60 are connected to the dialogue device 10a by wire or wirelessly.

ロボット６０は、対話装置１０ａによって送信された制御情報に応じて、各駆動機構や発光部、スピーカー又はカメラ等のロボット６０に設けられた機能を制御することによって、所定の動作を実行する。例えば、ロボット６０は、首、肩又は腕の各関節部に設けられた駆動機構を作動することによって動作する。ロボット６０は、例えば、肩又は脚等の各関節部に設けられた駆動機構を作動して歩行する動物の形状であってもよい。ロボット６０は、肩又は脚等の各関節部に設けられた駆動機構を作動して自立歩行する二足歩行等のロボット（ヒューマノイド）であってもよい。ロボット６０は、車輪又は無限軌道で移動できるような移動型ロボット（エージェント化されたロボット）であってもよい。ロボット６０は、例えばテーブルや受付台等の板状の台の上に設置される。 The robot 60 performs a predetermined operation by controlling the functions provided in the robot 60, such as each drive mechanism, light emitting unit, speaker, or camera, according to the control information transmitted by the dialogue device 10a. For example, the robot 60 operates by activating a drive mechanism provided in each joint of the neck, shoulder, or arm. The robot 60 may be in the form of an animal that walks by activating a drive mechanism provided in each joint of the shoulder or leg. The robot 60 may be a bipedal robot (humanoid) that walks independently by activating a drive mechanism provided in each joint of the shoulder or leg. The robot 60 may be a mobile robot (agentized robot) that can move on wheels or tracks. The robot 60 is placed on a plate-shaped platform, such as a table or a reception desk.

対話装置１０ａは、記憶部１１ａと、制御部１２ａとを備える。記憶部１１ａには、辞書１１１、状態情報１１２、話題モジュールセット１１３、出力言語情報１１４及び動作制御情報１１５ａ等が記憶される。記憶部１１ａは、磁気記憶装置や半導体記憶装置などの記憶装置を用いて構成される。動作制御情報１１５ａは、ロボット６０を制御するための情報を含む。例えば、動作制御情報１１５ａは、話題又は出力音声文字列と、制御内容とが対応付けられたテーブルであってもよい。制御内容は、各駆動機構や発光部、スピーカー又はカメラ等のロボット６０に設けられた機能を制御するための内容である。 The dialogue device 10a includes a memory unit 11a and a control unit 12a. The memory unit 11a stores a dictionary 111, state information 112, a topic module set 113, output language information 114, and operation control information 115a. The memory unit 11a is configured using a storage device such as a magnetic storage device or a semiconductor storage device. The operation control information 115a includes information for controlling the robot 60. For example, the operation control information 115a may be a table in which a topic or an output voice character string is associated with a control content. The control content is content for controlling the functions provided in the robot 60, such as each drive mechanism, a light-emitting unit, a speaker, or a camera.

制御部１２ａは、対話装置１０ａ全体を制御する。制御部１２ａは、ＣＰＵ等のプロセッサやメモリを用いて構成される。制御部１２ａは、プログラムを実行することによって、話題モジュールセット作成部１２０と、状態情報作成部１２１と、検出部１２２と、音声認識部１２３と、解析部１２４と、状態更新部１２５と、話題決定部１２６と、言語生成部１２７と、音声合成部１２８と、動作制御部１２９ａの機能を実現する。 The control unit 12a controls the entire dialogue device 10a. The control unit 12a is configured using a processor such as a CPU and a memory. By executing a program, the control unit 12a realizes the functions of the topic module set creation unit 120, the state information creation unit 121, the detection unit 122, the voice recognition unit 123, the analysis unit 124, the state update unit 125, the topic determination unit 126, the language generation unit 127, the voice synthesis unit 128, and the operation control unit 129a.

話題モジュールセット作成部１２０、状態情報作成部１２１、検出部１２２、音声認識部１２３、解析部１２４、状態更新部１２５、話題決定部１２６、言語生成部１２７、音声合成部１２８及び動作制御部１２９ａのうち一部または全部は、ＡＳＩＣやＰＬＤ、ＦＰＧＡなどのハードウェア（回路部；circuitryを含む）によって実現されてもよいし、ソフトウェアとハードウェアとの協働によって実現されてもよい。プログラムは、コンピュータ読み取り可能な記録媒体に記録されてもよい。コンピュータ読み取り可能な記録媒体とは、例えばフレキシブルディスク、光磁気ディスク、ＲＯＭ、ＣＤ－ＲＯＭ等の可搬媒体、コンピュータシステムに内蔵されるハードディスク等の記憶装置などの非一時的な記憶媒体である。プログラムは、電気通信回線を介して送信されてもよい。 Some or all of the topic module set creation unit 120, state information creation unit 121, detection unit 122, voice recognition unit 123, analysis unit 124, state update unit 125, topic determination unit 126, language generation unit 127, voice synthesis unit 128, and operation control unit 129a may be realized by hardware (including circuitry) such as an ASIC, PLD, or FPGA, or by a combination of software and hardware. The program may be recorded on a computer-readable recording medium. A computer-readable recording medium is, for example, a non-transitory storage medium such as a portable medium such as a flexible disk, optical magnetic disk, ROM, or CD-ROM, or a storage device such as a hard disk built into a computer system. The program may be transmitted via a telecommunications line.

話題モジュールセット作成部１２０、状態情報作成部１２１、検出部１２２、音声認識部１２３、解析部１２４、状態更新部１２５、話題決定部１２６、言語生成部１２７、音声合成部１２８及び動作制御部１２９ａの機能の一部は、予め対話装置１０ａに搭載されている必要はなく、追加のアプリケーションプログラムが対話装置１０ａにインストールされることで実現されてもよい。 Some of the functions of the topic module set creation unit 120, state information creation unit 121, detection unit 122, voice recognition unit 123, analysis unit 124, state update unit 125, topic determination unit 126, language generation unit 127, voice synthesis unit 128, and operation control unit 129a do not need to be pre-installed in the dialogue device 10a, and may be realized by installing additional application programs in the dialogue device 10a.

動作制御部１２９ａは、話題決定部１２６により決定された話題又は話題に基づく情報と、動作制御情報１１５ａとに基づいて、ロボット６０の動作を制御する。具体的には、動作制御部１２９ａは、動作制御情報１１５ａを参照し、話題決定部１２６により決定された話題に対応付けられた制御内容を取得する。動作制御部１２９ａは、取得した制御内容を実行させるための制御情報を生成する。動作制御部１２９ａは、生成した制御情報をロボット６０に出力することによって、ロボット６０の動作を制御する。 The operation control unit 129a controls the operation of the robot 60 based on the topic determined by the topic determination unit 126, or information based on that topic, and the operation control information 115a. Specifically, the operation control unit 129a refers to the operation control information 115a and acquires the control content associated with the topic determined by the topic determination unit 126. The operation control unit 129a generates control information for executing the acquired control content. The operation control unit 129a controls the operation of the robot 60 by outputting the generated control information to the robot 60.

なお、動作制御情報１１５ａとして、出力音声文字列と、制御内容とが対応付けられたテーブルが用いられる場合、動作制御部１２９ａは、出力言語情報１１４を参照し、話題決定部１２６により決定された話題に対応付けられた出力音声文字列を取得する。動作制御部１２９ａは、動作制御情報１１５ａを参照し、取得した出力音声文字列に対応付けられた制御内容を取得する。動作制御部１２９ａは、取得した制御内容を実行させるための制御情報を生成する。動作制御部１２９ａは、生成した制御情報をロボット６０に出力することによって、ロボット６０の動作を制御する。 When a table in which output voice character strings and control contents are associated is used as the operation control information 115a, the operation control unit 129a refers to the output language information 114 and acquires the output voice character string associated with the topic determined by the topic determination unit 126. The operation control unit 129a refers to the operation control information 115a and acquires the control contents associated with the acquired output voice character string. The operation control unit 129a generates control information for executing the acquired control contents. The operation control unit 129a controls the operation of the robot 60 by outputting the generated control information to the robot 60.
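The two lookup paths for the operation control information 115a, keyed by topic or keyed by output voice character string, can be sketched as follows. The table contents (robot parts and actions) and the function name `control_content` are invented for illustration; the description does not specify the format of the control content.

```python
# Stand-in for the output language information 114: topic -> output string.
OUTPUT_LANGUAGE = {
    "recommend a massage shop":
        "If you are OK with a massage shop introduction, "
        "I recommend a massage shop called XX",
}

# Two hypothetical forms of the operation control information 115a.
CONTROL_BY_TOPIC = {
    "recommend a massage shop": {"part": "arm", "action": "point"},
}
CONTROL_BY_STRING = {
    OUTPUT_LANGUAGE["recommend a massage shop"]: {"part": "neck", "action": "nod"},
}


def control_content(topic, keyed_by="topic"):
    """Look up the control content for a topic via either table form."""
    if keyed_by == "topic":
        return CONTROL_BY_TOPIC.get(topic)
    if keyed_by == "string":
        # First resolve the topic to its output string, then look that up.
        text = OUTPUT_LANGUAGE.get(topic)
        return CONTROL_BY_STRING.get(text)
    raise ValueError(f"unknown table form: {keyed_by!r}")


print(control_content("recommend a massage shop"))
print(control_content("recommend a massage shop", keyed_by="string"))
```

Keying by output string lets different phrasings of the same topic trigger different motions, at the cost of one extra lookup.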

＜変形例３＞ <Modification 3>
上述した実施形態では、ある話題に関するユーザ状態の情報が既に得られている場合でも、処理の流れによってはその話題に関する内容を音声出力してしまい、対話として不自然になる可能性がある。このような現象は、話題をスキップしたことにより、途中に確認すべき話題を飛ばして先の話題に関する内容を聞いてしまった際に起こりうる。例えば、シーケンス中の「パスタ屋知りたい？」という話題へのユーザ回答が既に得られているとする。上述した構成では、そのような状況であっても、「パスタ屋知りたい？」という話題の前に想定された「パスタ食べたい？」や「お腹空いた？」の話題モジュール１１３－ｎが起動条件を満たしている場合、それらの話題が選択・出力される可能性がある。したがって、「お腹空いた？」→「パスタ食べたい？」というようなパスタ店の紹介が予想できるような話題展開にもかかわらず、本来それらの次に選択される「パスタ屋知りたい？」は選択されずに、「ラーメン食べたい？」のような別の話題シーケンスのものになってしまうことがあり、ユーザがその話題展開を拍子抜けで不自然だと感じる可能性がある。 In the embodiment described above, even when information on the user's state regarding a certain topic has already been obtained, the content of that topic may still be output as voice depending on the flow of processing, which can make the dialogue unnatural. This can happen when topics have been skipped, that is, when the device has jumped over topics that should have been confirmed along the way and has already asked about a later topic. For example, assume that the user's answer to the topic "Want to know about pasta restaurants?" in a sequence has already been obtained. In the configuration described above, even in that situation, if the topic modules 113-n for "Want to eat pasta?" and "Are you hungry?", which are assumed to come before "Want to know about pasta restaurants?", satisfy their activation conditions, those topics may be selected and output. As a result, even though a development such as "Are you hungry?" → "Want to eat pasta?" leads the user to expect an introduction to a pasta restaurant, the topic "Want to know about pasta restaurants?" that would normally come next is not selected, and the dialogue may move to a topic from a different sequence, such as "Want to eat ramen?". The user may find this development anticlimactic and unnatural.

そこで、上述した実施形態において、各話題モジュール１１３－ｎの起動条件として、シーケンス中でその話題モジュール１１３－ｎよりも後に登場する話題への回答が埋まっていないことを＆で追加する方法がある。例えば、「パスタ食べたい？」という話題を持つ話題モジュール１１３－ｎは一つ後の話題の回答を記録する状態変数「パスタ屋知りたい」＝初期値という条件を＆で追加する。同様に、「お腹空いた？」という話題を持つ話題モジュール１１３－ｎは、それ以降の話題の回答を記録する状態変数「パスタ食べたい」＝初期値、「パスタ屋知りたい」＝初期値という条件を＆で追加する。これらの起動条件によって、各シーケンスで後に登場する話題へのユーザ回答が既に得られている場合、それ以前の話題は選択されなくすることができる。 Therefore, in the embodiment described above, one approach is to add to the activation condition of each topic module 113-n, joined with & (logical AND), the requirement that the answers to the topics appearing after that topic module 113-n in the sequence have not yet been filled in. For example, the topic module 113-n with the topic "Want to eat pasta?" adds, with &, the condition that the state variable recording the answer to the following topic, "want to know about pasta restaurants", equals its initial value. Similarly, the topic module 113-n with the topic "Are you hungry?" adds, with &, the conditions that the state variables recording the answers to the subsequent topics, "want to eat pasta" and "want to know about pasta restaurants", equal their initial values. With these activation conditions, when the user's answer to a topic that appears later in a sequence has already been obtained, the earlier topics in that sequence are no longer selected.
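Adding these extra conditions can be done mechanically from the order of a sequence. The sketch below assumes that the initial value of a state variable is "U" and that each module names the variable that records its answer in a `records` field; both, along with the base conditions and the function name, are assumptions made for illustration.

```python
INITIAL = "U"  # assumed initial value of every state variable


def add_later_unanswered_conditions(sequence):
    """AND into each module the requirement that all later answers are unset."""
    guarded = []
    for index, module in enumerate(sequence):
        condition = dict(module["condition"])
        for later in sequence[index + 1:]:
            # The variable recording a later topic's answer must still be initial.
            condition[later["records"]] = INITIAL
        guarded.append({**module, "condition": condition})
    return guarded


# Hypothetical pasta sequence, in dialogue order.
PASTA_SEQUENCE = [
    {"topic": "Are you hungry?", "condition": {}, "records": "hungry"},
    {"topic": "Want to eat pasta?", "condition": {"hungry": "Y"},
     "records": "want to eat pasta"},
    {"topic": "Want to know about pasta restaurants?",
     "condition": {"want to eat pasta": "Y"},
     "records": "want to know about pasta restaurants"},
]

guarded = add_later_unanswered_conditions(PASTA_SEQUENCE)
for module in guarded:
    print(module["topic"], module["condition"])
```

With these guards, once "want to know about pasta restaurants" holds "Y" or "N", neither "Are you hungry?" nor "Want to eat pasta?" can satisfy its activation condition.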

＜変形例４＞
上述した実施形態では、ユーザからの話題転換に対応できない場合がある。これは、上述した話題の選択ルールでは必ずしもユーザの直前の発話内容に基づいて対話装置１０、１０ａにおいて次の話題が選択されるとは限らないためである。具体的には、現在選択されている話題モジュール１３－ｎよりも優先度の低い話題モジュール１３－ｎを起動させるようなユーザの発話内容が得られた場合、対話装置１０、１０ａは直前の話題をそのまま繰り返すため、ユーザの直前の発話を反映した話題は選択できない。例えば、あるシーケンスの最後のステップに該当する話題「パスタ屋知りたい？」を実行中である場合を考える。この時、ユーザが話題を転換して、別のシーケンスの話題の根拠となるような「ラーメンも食べたい」という趣旨の発言をしたとしても、次のターンでは階層構造でより優先度の高い現在の話題「パスタ屋知りたい？」が優先されるため、直前のユーザの発話「ラーメンも食べたい」を即座に反映した話題変更は行なわれない。その結果、対話装置１０、１０ａとしては、ユーザの話題転換の意図を受け付けず、対話装置１０、１０ａ自身の意図を優先させて話題を提示するように振舞ってしまうことになる。このような状況では、ユーザの対話意欲を低減させる可能性が高い。その一方で、ユーザの発言を常に踏まえ続けていると、話題誘導が全く達成できない可能性もある。一定の割合で対話装置１０、１０ａの意図を押し通すことが、対話システム１００における意図や欲求が強調され、対話感の向上に繋がる可能性もある。 <Modification 4>
In the above-described embodiment, the dialogue device 10, 10a may be unable to follow a change of topic initiated by the user. This is because the topic selection rule described above does not necessarily select the next topic based on the user's immediately preceding utterance. Specifically, when the user's utterance would activate a topic module 113-n with a lower priority than the currently selected topic module 113-n, the dialogue device 10, 10a simply repeats the previous topic and therefore cannot select a topic that reflects that utterance. For example, consider a case where the topic "Want to know about pasta restaurants?", which corresponds to the last step of a certain sequence, is being executed. Even if the user changes the topic at this point and says something to the effect of "I also want to eat ramen", which would be the basis for a topic in another sequence, the current topic "Want to know about pasta restaurants?" has a higher priority in the hierarchical structure and takes precedence in the next turn, so the topic is not changed to reflect the user's utterance immediately. As a result, the dialogue device 10, 10a behaves as if it rejects the user's intention to change the topic and presents topics according to its own intention. In such a situation, the user's willingness to engage in dialogue is likely to decrease. On the other hand, if the user's remarks are always followed, topic guidance may not be achieved at all. Pushing through the intention of the dialogue device 10, 10a at a certain rate may also emphasize the intentions and desires of the dialogue system 100 and thereby improve the sense of dialogue.

Therefore, the topic determination unit 126 may determine the topic so that an utterance announcing that a variable in the state information 112 has been updated is made (for example, when the variable "Want to eat ramen" is updated to "Y", the utterance "I see, so you want to eat ramen."). With an output such as "I see, so you want to eat ramen. Um, about what we were just talking about, do you want to eat pasta?", the dialogue device 10, 10a can show the user that it understood the user's remark even when it does not follow the user's change of topic, so the sense of dialogue can be maintained.
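One way to picture this behavior is to compare the state before and after the update and prepend an acknowledgment for each newly confirmed variable. The following Python sketch is only an illustration; the `ACKNOWLEDGMENTS` table, the variable names, and `compose_utterance` are hypothetical and do not appear in the specification.

```python
# Hypothetical acknowledgment phrases, keyed by state variable.
ACKNOWLEDGMENTS = {
    "want_ramen": "I see, so you want to eat ramen.",
}


def compose_utterance(previous_state, new_state, topic_text):
    """Prefix the selected topic with an acknowledgment for every
    variable that has just been updated to "Y"."""
    acknowledgments = [
        ACKNOWLEDGMENTS[var]
        for var, value in new_state.items()
        if value == "Y" and previous_state.get(var) != "Y" and var in ACKNOWLEDGMENTS
    ]
    return " ".join(acknowledgments + [topic_text])


utterance = compose_utterance({}, {"want_ramen": "Y"}, "Do you want to eat pasta?")
print(utterance)
# I see, so you want to eat ramen. Do you want to eat pasta?
```

The selected topic itself is unchanged, so the priority-based selection rule stays intact while the user still receives evidence that the remark was registered.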

<Modification 5>
In the above-described embodiment, the topic that reaches the target topic as quickly as possible is always selected, turn after turn. However, if this continues many times, the user may sense that the dialogue system 100 intends to steer the conversation toward some target topic (e.g., an invitation to a date or an advertisement). In advertising, for example, concealing that intention affects whether the advertisement succeeds, so this behavior may lower the success rate of the advertisement. To improve this behavior, one conceivable control is to spend most of the time chatting on topics with low advertising intent and only occasionally develop topics that lead to the advertisement.

Two methods for implementing such a function are described below. The first is to define topic module groups and dynamically swap their priorities. A topic module group is a unit that treats multiple topic modules 113-n with similar functions as a single group. For example, the layers of topic modules 113-n made up of topics from the early part of each sequence are collectively defined as a chat topic module group, and the hierarchical structure of topics from the final part of each sequence is defined as an advertising topic module group. From the start of the dialogue, or from the completion of an advertisement, until a certain number of topics has been covered, the chat topic module group is given a higher priority than the advertising topic module group. Once that number of topics has been covered, the advertising topic module group is moved to a relatively higher priority. As a result, for most of the time the early topics of each sequence, which have low advertising intent, are selected intensively, and sequences that lead to the target topic are executed comparatively infrequently.
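The priority swap between groups can be sketched as follows. This is an assumption-laden illustration: the group names, the threshold of three topics, and the functions `group_order` and `select_topic` are hypothetical, and the set `active` stands in for the topics whose activation conditions are currently satisfied.

```python
def group_order(topics_since_reset, threshold=3):
    """Return the group names in priority order, highest first.

    Before `threshold` topics have been covered, chat outranks advertising;
    afterwards the order is swapped.
    """
    if topics_since_reset < threshold:
        return ["chat", "advertising"]
    return ["advertising", "chat"]


def select_topic(groups, active, topics_since_reset, threshold=3):
    """Pick the first active topic, scanning groups in their current order."""
    for group_name in group_order(topics_since_reset, threshold):
        for topic in groups[group_name]:
            if topic in active:
                return topic
    return None


groups = {
    "chat": ["hungry", "tired"],
    "advertising": ["want_pasta_restaurant_info"],
}
active = {"tired", "want_pasta_restaurant_info"}

print(select_topic(groups, active, topics_since_reset=0))  # tired
print(select_topic(groups, active, topics_since_reset=3))  # want_pasta_restaurant_info
```

The counter `topics_since_reset` would be reset to zero at the start of a dialogue and whenever an advertisement has been completed.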

The second is to design topic strategy sequences whose purpose is small talk and to assign those sequences a high priority. This realizes behavior in which sequences intended for small talk are executed and consumed early in the dialogue, after which sequences intended for advertising are executed.

<Modification 6>
The above-described embodiment targets topic-guiding dialogue, but does not cover topics such as the actual content of the advertisement (for example, the location of a store or the features of a product) or answers to questions from the user. The dialogue device 10, 10a may therefore be configured to output such topics by voice. Answers to questions from the user should preferably always take priority over any other topic presented by the dialogue device 10, 10a. To this end, in the dialogue device 10, 10a, an answer module set is placed above the topic module set 113 described above (for example, the module set for topic-guiding dialogue). Like the modules of the topic module set 113, an answer topic module outputs the applicable topic (answer) when a question from the user is detected. Because the answer topic modules are placed above every module of the topic-guiding dialogue, their output is always selected in preference to the output of the topic-guiding dialogue, so that questions from the user can be answered at any time.
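The layering of an answer module set above the topic-guiding module set can be sketched as an ordered scan over module sets. The code below is a hypothetical illustration only; the condition functions, the `question` variable, and the output labels are assumptions made for the example.

```python
def select_output(module_sets, state):
    """Return the output of the first satisfied module.

    `module_sets` is ordered from highest to lowest priority. Within each
    set, modules are (condition, output) pairs in priority order.
    """
    for modules in module_sets:
        for condition, output in modules:
            if condition(state):
                return output
    return None


# Hypothetical answer modules: fire when a question has been detected.
answer_modules = [
    (lambda s: s.get("question") == "store_location", "answer_store_location"),
]
# Hypothetical topic-guiding modules.
guidance_modules = [
    (lambda s: s.get("hungry") is None, "ask_hungry"),
]

module_sets = [answer_modules, guidance_modules]  # answers outrank guidance
print(select_output(module_sets, {"question": "store_location"}))  # answer_store_location
print(select_output(module_sets, {}))  # ask_hungry
```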

In the above-described embodiment, a topic for which an answer has once been obtained is not selected again. However, there may be cases where the same topic should be selected again as time passes or as the dialogue progresses. For example, questions such as "Are you hungry?" and "Are you tired?" are topics that are worth asking again after a certain amount of time, even if an answer has already been obtained. The dialogue device 10, 10a may therefore be configured to initialize some or all of the variables indicated in the state information 112 when a predetermined time has elapsed.
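A possible way to realize this time-based initialization is sketched below. The per-variable timestamps and the function `reset_stale_variables` are assumptions for illustration; the specification only states that some or all variables are initialized after a predetermined time.

```python
def reset_stale_variables(state, updated_at, now, max_age_s, initial=None):
    """Return a copy of `state` in which every variable last updated at
    least `max_age_s` seconds before `now` is set back to `initial`."""
    refreshed = {}
    for var, value in state.items():
        age = now - updated_at.get(var, now)  # unknown timestamp counts as age 0
        refreshed[var] = initial if age >= max_age_s else value
    return refreshed


state = {"hungry": "N", "want_pasta": "Y"}
updated_at = {"hungry": 0.0, "want_pasta": 900.0}  # seconds, hypothetical clock

refreshed = reset_stale_variables(state, updated_at, now=1000.0, max_age_s=600.0)
print(refreshed)  # {'hungry': None, 'want_pasta': 'Y'}
```

Resetting only the variables that have aged out lets "Are you hungry?" become eligible again while recent answers are preserved.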

<Modification 7>
In the above-described embodiment, content according to the topic is provided to the user by the robot or by the agent displayed on the display device, but other means may be used to provide that content. For example, a smart speaker (voice output device) may be used. When a smart speaker is used, the voice synthesis unit 128 causes the smart speaker to output the content according to the determined topic. In this case, the dialogue system 100 need not include the camera 20, the speaker 40, and the display device 50, and the dialogue system 100a need not include the camera 20 and the speaker 40. A dialogue by the dialogue systems 100, 100a is started when the smart speaker recognizes an utterance from the user. When configured in this way, the operation control units 129, 129a control the operation of the smart speaker.

The above configuration determines a topic based on the content of the user's utterance or the content input by text. The topic determination unit 126 may be configured to determine a topic by also taking the user's actions into account. In that case, a topic module is required in which a user action serves as the activation condition and in which the topic to be provided to the user when that activation condition is satisfied is associated with the activation condition. In addition, information that treats user actions as states is set in the state information 112.

In a topic module that uses a user action as its activation condition, an action of the user toward the robot, such as "a hand is reaching toward the robot" or "the user is trying to unplug the robot", is set as the activation condition. As the topic for when the activation condition is satisfied, a topic that lets the robot avoid danger in response to the user's action, such as "Caution", is set. As output speech strings corresponding to the topic "Caution", "Don't touch me", "Don't unplug me", and the like are set in the output language information 114.

The topic module for crisis avoidance created in this way is placed at the highest priority in the topic module set 113. Then, when the detection unit 122 detects the action of a hand reaching toward the robot 60, the state update unit 125 updates the state information 112 based on the detected user action. The topic determination unit 126 refers to the topic module set 113 and determines, as the topic, "Caution", which is associated with the activation condition in which "a hand is reaching toward the robot" is "Y". The language generation unit 127 then causes the content "Don't touch me", which is associated with the determined topic "Caution", to be output as voice or text.

Similarly, when the detection unit 122 detects the action of trying to unplug the robot 60, the state update unit 125 updates the state information 112 based on the detected user action. The topic determination unit 126 refers to the topic module set 113 and determines, as the topic, "Caution", which is associated with the activation condition in which "the user is trying to unplug the robot" is "Y". The language generation unit 127 then causes the content "Don't unplug me", which is associated with the determined topic "Caution", to be output as voice or text. The user's action is detected by the detection unit 122 as described above.
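The flow from a detected action to the output string can be sketched as a first-match scan over a priority-ordered module list. This is a simplified, hypothetical illustration: the tuple layout, the variable names, and the lower-priority "ask_hungry" module are assumptions, and the output text is stored inline rather than in separate output language information.

```python
# Hypothetical module list, highest priority first.
# Each entry: (activation condition, topic, output text).
MODULES = [
    ({"hand_reaching_toward_robot": "Y"}, "caution", "Don't touch me"),
    ({"unplugging_robot": "Y"}, "caution", "Don't unplug me"),
    ({"hungry": None}, "ask_hungry", "Are you hungry?"),
]


def decide(state, modules=MODULES):
    """Return (topic, text) of the first module whose condition holds."""
    for condition, topic, text in modules:
        if all(state.get(var) == value for var, value in condition.items()):
            return topic, text
    return None


# The state update step is modeled here as simply setting the variable to "Y".
print(decide({"hand_reaching_toward_robot": "Y"}))  # ('caution', "Don't touch me")
print(decide({}))                                   # ('ask_hungry', 'Are you hungry?')
```

Because the crisis-avoidance modules sit at the top of the list, they pre-empt any conversational topic whenever the corresponding action has been detected.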

As another example, the following configuration may be used for a topic module whose activation condition is a user action. An action of the user toward the robot or the agent, such as "the user waved at the robot 60 or the agent displayed on the display device 50" or "the user brought their face close to the robot 60 or the agent displayed on the display device 50", is set as the activation condition, and a topic responding to the user's action, such as "control the operation of the robot, the agent, or the voice output device", is set as the topic for when the activation condition is satisfied. Examples of operation control corresponding to the topic "control the operation of the robot, the agent, or the voice output device" include "wave back" and "tilt the head". The operation control corresponding to each topic is included in the operation control information 115, 115a.

The topic module created in this way is placed in the topic module set 113. Then, when the detection unit 122 detects that the user has waved at the robot 60 or the agent displayed on the display device 50, the state update unit 125 updates the state information 112 based on the detected user action. The topic determination unit 126 refers to the topic module set 113 and determines, as the topic, "control the operation of the robot, the agent, or the voice output device", which is associated with the activation condition in which "the user waved at the robot 60 or the agent displayed on the display device 50" is "Y". The operation control unit 129, 129a then refers to the operation control information 115, 115a and controls the robot 60 or the agent so that it performs the operation "wave back", which is associated with the determined topic.

Similarly, when the detection unit 122 detects that the user has brought their face close to the robot 60 or the agent displayed on the display device 50, the state update unit 125 updates the state information 112 based on the detected user action. The topic determination unit 126 refers to the topic module set 113 and determines, as the topic, "control the operation of the robot, the agent, or the voice output device", which is associated with the activation condition in which "the user brought their face close to the robot 60 or the agent displayed on the display device 50" is "Y". The operation control unit 129, 129a then refers to the operation control information 115, 115a and controls the robot 60 or the agent so that it performs the operation "tilt the head", which is associated with the determined topic.

With the above configuration, a topic can also be determined according to the user's actions. This enables dialogue suited to a variety of situations.

<Modification 8>
In the above embodiment, the state information 112 has been described as information representing the state of the user. However, the state information 112 may include information representing not only the state of the user but also the state of the dialogue system 100. Here, the state of the dialogue system 100 is a state that reflects the operations the dialogue system 100 has performed. Examples of the state of the dialogue system 100 include "uttered 〇〇 XX times" and "performed action △△ YY times". In actual operation, the dialogue device 10 is expected to utter the same content, or perform the same action, multiple times for a single user. By also taking into account the state of the utterances and actions performed by the devices that constitute the dialogue system 100 (e.g., the dialogue device 10) when determining what to say to the user, the range of utterances can be made broader than when the content is determined from the user's state alone. In this configuration, the storage unit 11 of the dialogue device 10 stores, as the state information 112, information representing the state of the user and the state of the dialogue system 100. Furthermore, activation conditions that also include the state of the dialogue system 100 are registered in the topic modules that constitute the topic module set 113. The state update unit 125 updates the variables of the user's state information according to the content of the user's utterance or the content input by text, and updates the variables of the state information of the dialogue system 100 according to the content of the utterances or the actions of the dialogue system 100. The topic determination unit 126 determines a topic according to the state of the user and the state of the dialogue system 100 based on the state information 112 and the topic module set 113.
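State information that covers both the user and the system can be sketched as two small stores, with the system side counting how often each utterance has been made. The class `StateInfo`, the function `is_active`, and the repeat limit are hypothetical names and parameters chosen for this illustration.

```python
from collections import Counter


class StateInfo:
    """Hypothetical container for user state and system state."""

    def __init__(self):
        self.user = {}           # e.g. {"hungry": "Y"}
        self.system = Counter()  # how many times each topic was uttered

    def record_user_answer(self, variable, value):
        self.user[variable] = value

    def record_system_utterance(self, topic):
        self.system[topic] += 1


def is_active(state, user_condition, topic, max_utterances):
    """Activation condition combining user state with a system-side counter:
    the user condition must hold AND the topic must have been uttered
    fewer than `max_utterances` times."""
    user_ok = all(state.user.get(var) == value for var, value in user_condition.items())
    return user_ok and state.system[topic] < max_utterances


state = StateInfo()
state.record_user_answer("hungry", "Y")
print(is_active(state, {"hungry": "Y"}, "recommend_pasta", max_utterances=2))  # True

state.record_system_utterance("recommend_pasta")
state.record_system_utterance("recommend_pasta")
print(is_active(state, {"hungry": "Y"}, "recommend_pasta", max_utterances=2))  # False
```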

The topic determination unit 126 may determine a topic according to the state of the dialogue system 100 based on the state information 112 and the topic module set 113. In the dialogue system 100a as well, the state information 112 may include information representing not only the state of the user but also the state of the dialogue system 100.

<Modification 9>
In the above embodiment, an example was shown in which the activation condition consists of a combination of two or more of the states indicated in the state information 112, but the activation condition only needs to include at least one candidate state.

<Modification 10>
In the above embodiment, an activation condition is associated with every topic module constituting the topic module set 113, but the topic module set 113 may include a topic module with which no activation condition is associated. FIG. 33 is a diagram showing an example (part 5) of a topic module set in a modification. The topic module set 113 shown in FIG. 33 is obtained by adding a new topic module 113-20 to the topic module set 113 shown in FIG. 2. The topic module 113-20 has no activation condition and is placed at the lowest priority position. Therefore, the topic module 113-20 is executed when the activation condition is satisfied for none of the topic modules 113-1 to 113-9, which have a higher priority than the topic module 113-20. A topic module 113-n with no activation condition may be placed at any position.

As shown in FIG. 33, a response by AI may be set as the topic module 113-n for which no activation condition is set. For example, an artificial intelligence that automatically generates text according to the input content, such as ChatGPT, may be used for the response by AI. In this case, the artificial intelligence automatically generates text according to the content of the user's dialogue, and the voice synthesis unit 128 of the dialogue device 10 generates a voice signal corresponding to the generated character string. The voice signal generated by the voice synthesis unit 128 is then output from the speaker 40. With this configuration, the dialogue can be continued even for content that the scenarios did not cover.
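A module without an activation condition acts as a fallback when nothing else fires. The sketch below illustrates this under stated assumptions: `decide_handler`, `scripted`, and `generative_fallback` are hypothetical names, and `generative_fallback` is a stub that stands in for a text-generation model; no real API is called.

```python
def decide_handler(modules, state):
    """Return the handler of the first eligible module.

    `modules` is a list of (condition, handler) pairs, highest priority
    first. A condition of None means no activation condition is set, so
    the module is always eligible.
    """
    for condition, handler in modules:
        if condition is None or all(state.get(k) == v for k, v in condition.items()):
            return handler
    return None


def scripted(text):
    """Handler that ignores the user's text and returns a scripted line."""
    return lambda user_text: text


def generative_fallback(user_text):
    # Stand-in for a text-generation model; purely illustrative.
    return f"(generated reply to: {user_text})"


MODULES = [
    ({"hungry": "Y"}, scripted("Do you want to eat pasta?")),
    (None, generative_fallback),  # no activation condition, lowest priority
]

print(decide_handler(MODULES, {"hungry": "Y"})("I'm starving"))
# Do you want to eat pasta?
print(decide_handler(MODULES, {})("Tell me a joke"))
# (generated reply to: Tell me a joke)
```

In this sketch a condition-free module shadows everything listed after it, which is why it is placed last here; placing it higher would change which scripted modules can still be reached.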

<Modification 11>
In the above-described embodiment, the dialogue device 10 is configured to include the topic module set creation unit 120. The topic module set creation unit 120 may be implemented in a device different from the dialogue device 10. In such a configuration, the device including the topic module set creation unit 120 may be referred to as a topic module set creation device. The topic module set creation unit 120 included in the topic module set creation device creates a topic module set by the above-described method. The topic module set created by the topic module set creation unit 120 may be stored in the dialogue device 10 via a recording medium, or may be transmitted from the topic module set creation device to the dialogue device 10 by communication. The recording medium is, for example, a USB (Universal Serial Bus), an SD card, a hard disk, or the like.

With this configuration, the topic module set is created in a device different from the dialogue device 10, so the dialogue device 10 that actually converses with the user does not need to create the topic module set. The processing load on the dialogue device 10 can therefore be reduced. Furthermore, the topic module set creation device creates, based on one or more scenarios in which a one-way dialogue flow is described, a topic module set including a plurality of topic modules, each indicating a topic to be provided to the user who is the dialogue partner when a predetermined condition is satisfied. As a result, the designer does not need to create a scenario SC containing complex branches and only has to create one or more scenarios SC in which a one-way dialogue flow is described. The topic module set creation device can then easily create the topic module set from those scenarios SC. The topic module set is configured so that the content of any topic whose activation condition is satisfied can be output. Therefore, the dialogue can be continued in actual operation even when complex topic transition patterns have not been fully anticipated.

Although an embodiment of the present invention has been described in detail above with reference to the drawings, the specific configuration is not limited to this embodiment, and designs and the like within a scope that does not depart from the gist of the present invention are also included.

10, 10a...Dialogue device, 20...Camera, 30...Microphone, 40...Speaker, 50...Display device, 60...Robot, 11, 11a...Storage unit, 12, 12a...Control unit, 120...Topic module set creation unit, 121...State information creation unit, 122...Detection unit, 123...Speech recognition unit, 124...Analysis unit, 125...State update unit, 126...Topic determination unit, 127...Language generation unit, 128...Voice synthesis unit, 129, 129a...Operation control unit

Claims

A topic module set creation device comprising:
a topic module set creation unit that creates, based on one or more scenarios in which a one-way dialogue flow is described, a topic module set including a plurality of topic modules each indicating a topic to be provided to a user who is a dialogue partner when a predetermined condition is satisfied,
wherein at least one of the topic modules is set with an activation condition including at least one combination of a candidate state and a variable for expressing a state of the user or the system, and
wherein the topic module set creation unit creates a plurality of topic modules, each being a set of the candidate state and the topic, by associating content corresponding to the topic indicated in the topic module with each topic module, and creates the topic module set by arranging the created topic modules in a hierarchical structure in a predetermined priority order.

A dialogue device comprising:
the topic module set creation device according to claim 1;
a topic determination unit that determines a topic according to the state of the user or the state of the system based on the topic module set created by the topic module set creation device and state information that represents at least the state of the user or the system; and
an output unit that outputs content according to the determined topic.

The dialogue device according to claim 2, wherein the scenario consists of a plurality of topics,
the dialogue device further comprising a state information creating unit that creates the state information by treating each topic constituting the scenario as a candidate state and associating a variable with each candidate state.

The dialogue device according to claim 3, wherein the state information creating unit, when creating the state information, integrates a plurality of topics having a similar meaning among the topics constituting the scenario into one topic.

The dialogue device according to claim 2, further comprising a state update unit that updates state information about the user in response to at least the content of the user's utterance or the content input by text, and updates state information about the system in response to the content of the utterances or the actions in the system,
wherein the state update unit updates the state information by updating a variable of a candidate state according to the content of the user's utterance or the content input by text, or the content of the utterances or the actions in the system.

The dialogue device according to claim 2, wherein the topic determination unit determines, based on the state information, a topic with a high priority among the topics associated with the satisfied activation condition as a topic corresponding to the state of the user or the system.

The dialogue device according to claim 2, wherein the topic determination unit, when it is necessary to switch to a topic constituting a second scenario during a dialogue based on a first scenario, determines a next topic from among the topics constituting the second scenario based on the state information.

The dialogue device according to claim 2, wherein the topic determination unit, when there is no topic module that satisfies the activation condition, selects a dialogue module for which the activation condition is not set.

The dialogue device according to claim 2, further comprising an operation control unit that controls an operation of a robot connected to the device, an agent displayed on a display device, or a voice output device based on the topic or information based on the topic determined by the topic determination unit.

The computer
A topic module set is created based on one or more scenarios describing a one-way dialogue flow, the topic module set including a plurality of topic modules each indicating a topic to be provided to a user who is a dialogue partner when a predetermined condition is satisfied;
At least one or more topic modules are set with an activation condition including at least one combination of a candidate state and a variable for expressing a state of the user or the system;
A topic module set creation method, which creates multiple topic modules that are sets of the candidate states and the topics by matching content corresponding to the topic shown in the topic module with each topic module, and creates the topic module set by arranging the multiple topic modules created in a hierarchical structure with a predetermined priority.

The computer
determining a topic according to a state of the user or the system based on the topic module set created by the topic module set creation method according to claim 10 and state information representing at least a state of the user or the system;
and outputting content according to the determined topic.

Computer,
A computer program for causing the topic module set creation device according to claim 1 to function as the dialogue device according to claim 2.