JP2000067064A

JP2000067064A - Interaction recording system

Info

Publication number: JP2000067064A
Application number: JP10233857A
Authority: JP
Inventors: Kentaro Onishi; 健太郎大西; Takaaki Habara; 貴明羽原; Hideo Inoue; 秀夫井上; Norikazu Yamagishi; 令和山岸; Mutsuharu Takesada; 睦治武貞
Original assignee: Hitachi Electronics Services Co Ltd
Current assignee: Hitachi Electronics Services Co Ltd
Priority date: 1998-08-20
Filing date: 1998-08-20
Publication date: 2000-03-03

Abstract

PROBLEM TO BE SOLVED: To provide an interaction recording system that accepts complaints and the like by voice and conveys them to a processor in a form usable for subsequent processing. SOLUTION: The system includes a recording device 200 that records the voice data of an interaction, and an information processor 100 that generates an identifier for identifying a specific part of the voice data to be recorded and causes the recording device 200 to record that identifier. The information processor 100 accepts a request to generate an identifier, generates the identifier for the voice data to be recorded in the recording device 200, and records the identifier in the recording device 200 in association with that voice data. The recording device 200 stores voice data 212 and identifier data 211, together with text data 214 obtained when a voice recognizing part 330 performs voice recognition on the voice data.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a system that records a dialogue so that its contents can be used later, and in particular to a dialogue recording system, and a reception processing system, suitable for accepting complaints, inquiries, failure reports, accident reports, and the like by voice.

[0002]

2. Description of the Related Art

Inquiries and complaints about products are generally made by telephone. For example, when a failure occurs in a computer system or an information communication system and a customer of that system telephones to report it, a maintenance engineer takes notes while asking about the delivered equipment and the state and details of the failure, thereby accepting the failure report and confirming its contents. Once the report has been accepted, one or more maintenance engineers visit the customer site and deal with the failure based on the reported contents.

[0003] In recent years, however, for reasons such as efficient staffing, labor saving, handling of diverse inquiries, and the need for 24-hour service, reception centers that collectively cover a wide service area are increasingly being set up to accept complaints, inquiries, accident reports, failure reports, and the like. For example, there is a trend toward installing a small number of reception centers that each cover a wide area, such as one in eastern Japan and one in western Japan.

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION

When such a centralized reception center is set up, the tendency is to adopt a division of labor: dedicated reception staff at the center accept many kinds of notifications and reports and write memos, the memos are passed to the staff in charge of processing, and those staff carry out the actual work while referring to the memos. This division of labor has the advantage that even 24-hour reception can be handled efficiently with few personnel.

[0005] A memo, however, differs from the caller's own voice. Although a memo conveys information in organized form, the actual voice of the person making the report does not reach the staff in charge of processing, so part of the information is easily lost.

[0006] One conceivable alternative is for the receptionist to interview the caller and enter the contents of the dialogue into a computer to create a report. With this method, however, the receptionist must type while conversing by telephone or the like; entering the information accurately is not necessarily easy, and the method is not in general use.

[0007] It is also conceivable to record the dialogue at the time of reception and have the staff in charge of processing listen to the recording and act on it. In this case, the receptionist can concentrate on the interview so as to draw out the necessary information from the other party. However, listening to a long dialogue to pick out the necessary items takes time and is inefficient. The problem is especially serious in urgent cases.

[0008] The present invention has been made to solve these problems. Its object is to provide a dialogue recording system that can accept complaints, inquiries, failure reports, accident reports, and the like by voice and can convey the accepted information to the processor in a form usable for subsequent processing, and to provide a reception processing system using that dialogue recording system.

[0009]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, a first aspect of the present invention provides a dialogue recording system that records the contents of at least one side of a dialogue. The system comprises a recording device that records voice data of the dialogue, and an information processing device that generates an identifier for identifying a specific portion of the voice data to be recorded and causes the recording device to record the identifier. The information processing device accepts a request to generate an identifier for identifying a specific portion of the voice data recorded in the recording device, generates the identifier, and records the identifier in the recording device in association with the voice data to be recorded.

[0010] Another aspect of the present invention provides a reception processing system comprising a telephone device for receiving an outside call and conducting a conversation, and the dialogue recording system described above. The telephone device has a microphone and a speaker for the conversation and inputs both the received voice signal and the voice signal from the microphone to the information processing device, and the information processing device causes the recording device to record the input voice signals.

[0011] In the dialogue recording system and the reception processing system, the following aspects may further be adopted as appropriate, individually or in combination.

[0012] a) The system further comprises volume detection means for detecting, based on the volume of the input voice, the rise of the voice after a break in the voice, and means for outputting a signal requesting generation of an identifier for identifying the voice data input at that point; the information processing device generates the identifier upon receiving the identifier generation request signal.

[0013] b) The system further comprises an input device having a key for requesting generation of an identifier; the information processing device generates the identifier upon receiving an identifier generation request signal from the input device.

[0014] c) Identifier generation request signals from the input device are of two kinds: one identifies the region recorded later in time than the position of the identifier, and the other identifies the region recorded earlier in time than the position of the identifier. The input device has a function for entering each kind.

[0015] d) The information processing device uses, as the identifier, time information for the point at which the identifier generation request signal was received.

[0016] e) The information processing device uses, as the identifier, the address at which the corresponding voice data is recorded in the recording device when the identifier generation request signal is received.

[0017] f) The system further comprises voice recognition means for performing voice recognition on the digitized voice data and converting it into a character string; the information processing device causes the recording device to record the converted character string as text data in association with the original voice data.

[0018] g) The system further comprises means for checking whether a character string converted from voice by the voice recognition means contains a predetermined keyword and, if so, changing the display mode of the corresponding portion of the character string.

[0019] h) The information processing device causes the voice of each speaker in the dialogue to be recorded in a separate area of the recording device.

[0020]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Embodiments of the present invention are described below with reference to the drawings. In the following example, the person making a complaint, inquiry, failure report, or the like is represented by a customer C, and one reception center A and one processor B are assumed. The present invention is, of course, not limited to this form.

[0021] FIG. 1 outlines the environment in which the voice reception processing system of the present invention is applied. In FIG. 1, some abnormality first occurs in the system of customer C, and the person in charge at customer C (hereinafter simply called the customer) telephones reception center A via a communication network M such as the public network. The receptionist at reception center A interviews the customer, and the conversation is recorded. This is handled by a voice reception processing system that uses the dialogue recording system of the present invention. The processor B, who deals with failures and the like, then receives the contents accepted by voice, extracts the necessary items, and carries out the necessary work on the customer system C based on the extracted items. In this way, the various reports accepted by voice are acted upon.

[0022] FIG. 2 outlines the exchange of information at reception center A. As shown in FIG. 2, reception center A is equipped with a telephone device 500 for conversing with customer C by telephone and a dialogue recording system, which together constitute the voice reception processing system. The dialogue recording system, shown in detail in FIG. 3, includes at least an information processing device 100, a recording device 200, an input device 410, and a display device 420. The telephone device 500 accepts a call from customer C, and a dialogue takes place with the receptionist at center A. The contents of the dialogue are recorded in the recording device 200. During recording, the information processing device 100 attaches identifiers to the recorded contents. The present embodiment has both a function that automatically attaches an identifier to the beginning of each customer utterance and a function that lets the receptionist attach an identifier through the input device as needed. Of course, only one of the two may be provided. The customer's speech is thus recorded in the recording device 200 as customer voice data together with the identifiers. The recorded customer voice data can be sent to the processor via communication means.

[0023] FIG. 3 shows an example of the configuration of the hardware system used in the present embodiment. The system includes the telephone device 500, the information processing device 100, the recording device 200, and peripheral devices 310 to 430.

[0024] The telephone device 500 includes a line control unit 510 for connecting to a telephone line and a transmitter/receiver 520 for sending and receiving speech. The transmitter/receiver 520 extracts the received voice signal of the customer and transmits the voice signal of the receptionist. Connected to the transmitter/receiver 520 are a speaker 531, which converts the received customer voice signal into sound and outputs it, and a microphone 532, which inputs the receptionist's voice to the transmitter/receiver as an electrical voice signal. When a call arrives, the line control unit 510 notifies the receptionist with a ring tone, and when the receptionist answers the telephone, that is, goes off-hook, it sends a signal to that effect to the information processing device 100. The line control unit 510 may instead be mounted in a private branch exchange or the like.

[0025] The information processing device 100 has a central processing unit (CPU) 110 that executes various kinds of processing, a memory 120 that stores the programs executed by the CPU 110 and the data they use, and an interface 130. Although not shown, the CPU 110 has built-in memory and the like. The CPU 110 also has a calendar function, which is used to implement the time stamps described later. A timer may also be provided.

[0026] Various devices are connected to the information processing device 100 via the interface 130. Specifically, the recording device 200, an analog-to-digital converter (A/D) 310, an analog-to-digital converter (A/D) 320, a voice recognition unit 330, a volume detection unit 340, a first identifier automatic generation unit 350, and a third identifier generation unit 360 are connected in a form built into the information processing device 100. Of course, each of these may instead be an external device. The input device 410, the display device 420, and a communication control device 430 are also connected.

[0027] The recording device 200 is, for example, a hard disk device. For each customer who calls, the recording device 200 holds a customer voice file 210 and customer data 290 containing the customer's address, telephone number, various attribute data, and the like. The customer voice file 210 stores customer voice data 212, which is the digitized voice signal of the customer; receptionist voice data 213, which is the digitized voice signal of the receptionist; text data 214 obtained by voice recognition; and identifier data 211, described later.
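
The layout of the customer voice file 210 described above can be pictured with a small data structure. The following Python sketch is only an illustration: the class name, field names, and the in-memory representation are assumptions and do not appear in the specification, which says only that the four kinds of data are kept in one file per customer.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CustomerVoiceFile:
    """Hypothetical in-memory picture of customer voice file 210."""
    # 211: identifier data, here (position, attribute) pairs
    identifier_data: List[Tuple[float, str]] = field(default_factory=list)
    # 212: digitized customer voice
    customer_voice: bytearray = field(default_factory=bytearray)
    # 213: digitized receptionist voice, kept apart from the customer voice
    receptionist_voice: bytearray = field(default_factory=bytearray)
    # 214: text obtained by voice recognition
    text_data: List[str] = field(default_factory=list)


voice_file = CustomerVoiceFile()
voice_file.customer_voice.extend(b"\x01\x02")      # samples from A/D 310
voice_file.receptionist_voice.extend(b"\x03")      # samples from A/D 320
voice_file.identifier_data.append((0.0, "forward"))
```

Keeping the two voice streams in separate fields mirrors the option of recording each speaker in a separate area.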

[0028] In the present embodiment and the other embodiments described later, the identifier data 211, the customer voice data 212, the receptionist voice data 213, and the text data 214 are stored in the same file. The present invention is not limited to this, however. Some or all of the data may be stored independently rather than grouped by customer. For example, the identifier data 211 may be stored on its own, as the customer data 290 is. Alternatively, the identifier data 211, the customer voice data 212, the receptionist voice data 213, and the text data 214 may all be stored independently.

[0029] The A/D 310 converts the customer's voice signal into a digital signal. The A/D 320 converts the receptionist's voice signal into a digital signal. Both the A/D 310 and the A/D 320 have an output buffer (not shown) that temporarily holds the digitized voice signal. Each of them outputs a voice data recording request signal to the information processing device 100 and, when the request is accepted, sends the digitized voice signal held in its output buffer to the information processing device 100. The voice data thus sent is input to the information processing device 100 via the interface 130 and is then recorded in the corresponding customer voice file 210 in the recording device 200. Voice data recorded in the file can be read out and converted into an analog signal by a digital-to-analog converter to reproduce it as sound. The data may be compressed when it is recorded. Although not shown, the A/D 320 includes an amplifier for amplifying the input from the microphone 532.

[0030] When the telephone line is a digital line and the voice is transmitted as digital data, the A/D 310 can be omitted. In that case, however, it is preferable to provide a buffer memory that temporarily holds the data to be input to the information processing device 100.

[0031] The voice recognition unit 330 reads the digitized voice signal and converts the speech into the corresponding characters or character strings to generate text data. In the present embodiment it is connected as an independent device, although the processing may of course be performed by the CPU 110 instead. The present embodiment uses a speaker-independent voice recognition unit 330.

[0032] The volume detection unit 340 has a function of detecting a point at which the level rises sharply after a break in the voice. Specifically, the volume detection unit 340 detects the level of the customer's voice signal and determines whether it is at or above a predetermined reference value. While the level is below the reference value, the unit judges that speech has paused; when the level subsequently exceeds the reference value, it judges that speech has begun and outputs a head signal marking the start of the utterance. FIG. 4 shows an example. Before time t4, the volume level stays low for a while (a silent state), and at time t4 the volume level rises sharply. The volume detection unit 340 detects this state and outputs the head signal. The head signal is input to the first identifier automatic generation unit 350. In the present embodiment only the customer's voice is monitored, but the volume of the receptionist's voice may be detected as well.
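
The behavior of the volume detection unit 340 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the circuit of the embodiment: the function name, the frame-based level sequence, and the `min_silent_frames` parameter (how long the level must stay low before a rise counts as a new utterance) are all hypothetical.

```python
from typing import Iterable, List


def detect_utterance_starts(levels: Iterable[float],
                            threshold: float,
                            min_silent_frames: int = 3) -> List[int]:
    """Return the frame indices at which the level reaches `threshold`
    after at least `min_silent_frames` consecutive frames below it."""
    starts: List[int] = []
    # Assumption: the time before recording begins counts as silence.
    silent_run = min_silent_frames
    for index, level in enumerate(levels):
        if level < threshold:
            silent_run += 1
        else:
            if silent_run >= min_silent_frames:
                # A head signal would be issued at this frame.
                starts.append(index)
            silent_run = 0
    return starts


# Two utterances separated by a pause of three quiet frames.
levels = [0.0, 0.1, 0.8, 0.9, 0.7, 0.1, 0.0, 0.1, 0.9, 0.8]
print(detect_utterance_starts(levels, threshold=0.5))  # prints [2, 8]
```

Requiring a minimum run of quiet frames keeps a brief dip in the middle of a word from being treated as the start of a new utterance.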

[0033] The first identifier automatic generation unit 350 receives the output signal of the volume detection unit 340 and generates a first identifier for that point in time. This identifier is described later. Each generated identifier is stored in the corresponding customer voice file 210.

[0034] The third identifier generation unit 360 checks whether the text data produced from the voice by the voice recognition unit 330 contains a preset keyword and, if it does, generates a third identifier.

[0035] The input device 410 is, for example, a keyboard. Only a keyboard is shown in FIG. 3, but a mouse, a touch panel, and the like may also be provided. Keys for entering the second identifier are defined on the keyboard by assigning specific keys to that purpose. For example, the ">" key can be defined as a key (hereinafter called the forward key) indicating that attention should be paid to the contents recorded in the region later in time, starting immediately after the key is pressed, and the "<" key can be defined as a key (hereinafter called the reverse key) indicating that attention should be paid to the contents recorded in the region earlier in time, ending immediately before the key is pressed. Alternatively, symbols may be displayed on the screen of the display device described later so that an identifier can be entered by clicking a symbol with a mouse or the like. It may also be made possible to enter a message before or after the key input. In particular, when an identifier is attached after the customer has spoken, the contents are already known, so adding a special message, code, or the like representing the contents makes later retrieval easier. Note that ">" and "<" indicate the attribute of the identifier. They are recorded together with the time stamp, address, or the like, described later, that indicates the position of the identified data.
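
The forward and reverse keys can be modeled as a mapping from a key press to an identifier record. The Python sketch below is illustrative only: the record type `SecondIdentifier`, the function `on_key`, and the field names are assumptions introduced here; the text specifies only that the attribute is recorded together with a position and, optionally, a message.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed key assignment, following the ">" / "<" example in the text.
KEY_ATTRIBUTES = {">": "forward", "<": "reverse"}


@dataclass(frozen=True)
class SecondIdentifier:
    attribute: str                 # "forward" or "reverse"
    position: float                # time stamp or address of the voice data
    message: Optional[str] = None  # optional note typed by the receptionist


def on_key(key: str, position: float,
           message: Optional[str] = None) -> Optional[SecondIdentifier]:
    """Turn a key press into an identifier record; ignore other keys."""
    attribute = KEY_ATTRIBUTES.get(key)
    if attribute is None:
        return None
    return SecondIdentifier(attribute, position, message)


# The receptionist marks an answer that has just been given.
marker = on_key("<", position=30.0, message="model number")
```

A reader of the recording would then listen forward from a "forward" marker and backward from a "reverse" marker.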

[0036] The display device 420 is used to display various screens, including the screens needed for the operations described above, an interview guide screen for the receptionist, a customer data display screen, a screen that displays voice recognition results for confirmation, and a screen that displays the converted text data.

[0037] The communication control device 430 is a device through which the information processing device performs data communication and the like. The communication control device 430, the communication network M, a communication control device 930 on the processor's side, and an information processing device 800 on the processor's side can be used to transfer customer voice data files and the like to the processor. Data from the processor can also be received. If the processor's communication control device 930 is a mobile telephone and the information processing device 800 is a portable terminal, the processor can obtain the necessary information on site.

[0038] Next, the identifiers mentioned above are described. An identifier makes it easy to find the desired information when the contents of the dialogue are referred to later. Accordingly, there are three kinds: first, an identifier that indicates where information is located; second, an identifier that indicates that there is a matter deserving particular attention; and third, an identifier that indicates the presence of specific content.

[0039] A typical example of the first identifier, which indicates position, is the identifier marking the beginning of an utterance described above. Because it is attached regardless of content, it is easy to attach automatically.

[0040] The second identifier, which indicates that the contents deserve attention, is attached when what the speaker says can be predicted or pinned down to some extent: for example, when the next topic can be anticipated because a question about a specific matter has just been asked, or when the contents are already known because they have just been spoken. It is therefore suited to cases in which the receptionist decides the generation timing manually. Of course, in a question-and-answer exchange such as an interview, it would also be easy to generate the identifier automatically.

[0041] For the third identifier, which indicates the presence of specific content, specific keywords are set in advance, and when one of those keywords appears in the conversation, the display mode of the character string matching the keyword is changed. For example, its display color is changed.
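
Keyword detection in the recognized text can be sketched as follows. This Python example is an illustration under assumptions: the function names are hypothetical, plain substring matching is assumed, and the `[[...]]` markers merely stand in for a change of display color, which depends on the display device.

```python
import re
from typing import List, Sequence, Tuple


def find_keywords(text: str, keywords: Sequence[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end, keyword) for every keyword occurrence, in text order."""
    hits = []
    for keyword in keywords:
        for match in re.finditer(re.escape(keyword), text):
            hits.append((match.start(), match.end(), keyword))
    return sorted(hits)


def mark_keywords(text: str, keywords: Sequence[str]) -> str:
    """Wrap each keyword occurrence in [[...]] as a stand-in for a color change."""
    # Longer keywords first, so that a keyword containing another one wins.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = "|".join(re.escape(k) for k in ordered)
    if not pattern:
        return text
    return re.sub(pattern, lambda m: f"[[{m.group(0)}]]", text)


print(mark_keywords("disk error on startup", ["error"]))
# prints: disk [[error]] on startup
```

The positions returned by `find_keywords` could serve as third identifiers, while `mark_keywords` corresponds to changing how the matching text is displayed.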

[0042] Next, it is described how an identifier is actually generated and how it specifies the relevant position in the target data.

[0043] The first technique is the time stamp method, in which the identifier is generated as time information indicating when it was generated. Possible approaches include generating time information for the moment at which a specific event occurs, and specifying the moment of the event as the elapsed time from a particular starting point; in either case the time information is recorded as if it were stamped onto the data. The time information can be taken from, for example, the calendar information generated in the CPU 110. Alternatively, a timer may be set to obtain time information indicating the elapsed time.

[0044] As an example of the time information, as shown in FIG. 4, when the forward key is entered from the input device 410 at time t2, the CPU 110 generates a time stamp for that moment. When the reverse key is entered from the input device 410 at time t3, the CPU 110 likewise generates a time stamp for that moment. When the volume detection unit 340 detects that the volume level has risen at time t4 after remaining low for a certain period, the first identifier automatic generation unit 350 responds by outputting a first identifier automatic generation signal at time t4. This signal is an identifier generation request signal and carries the forward key attribute.
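
The time stamp method can be sketched as a small class that records the elapsed time at each identifier generation request. This Python example is a hedged illustration: the class `TimeStamper`, its `request` method, and the injectable `clock` argument are assumptions made so that the example is deterministic; the specification says only that calendar information or a timer may supply the time.

```python
import time
from typing import Callable, List, Tuple


class TimeStamper:
    """Generates identifiers as elapsed seconds since recording started."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._start = clock()  # starting point for elapsed time
        self.identifiers: List[Tuple[float, str]] = []

    def request(self, attribute: str) -> float:
        """Handle an identifier generation request and return the time stamp."""
        stamp = self._clock() - self._start
        self.identifiers.append((stamp, attribute))
        return stamp


# Replay of the t2, t3, t4 events using a fake clock with made-up times.
ticks = iter([100.0, 102.0, 103.0, 104.0])
stamper = TimeStamper(clock=lambda: next(ticks))
stamper.request("forward")   # forward key at t2
stamper.request("reverse")   # reverse key at t3
stamper.request("forward")   # automatic first identifier at t4
```

Storing the attribute next to each time stamp lets a later reader know in which direction from the marked point to listen.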

【００４５】The second method identifies a position by information indicating a storage address. The identifier is generated as address information indicating the storage address of the information being stored at the moment the identifier is generated. For example, if an identifier generation request arrives while voice data is being stored sequentially in the recording device 200, the storage address held at that moment by the address counter built into the CPU 110 is acquired and used as the identifier.

【００４６】Address information is generated, for example, as shown in FIG. 4. When the forward key is input from the input device 410 at time t2, the CPU 110 generates, as the identifier, the address in the recording device 200 at which voice data is being stored at that moment. Similarly, when the reverse key is input from the input device 410 at time t3, the CPU 110 generates, as the identifier, the address in the recording device 200 at which voice data is being stored at that moment. Further, when the volume detection unit 340 detects at time t4 that the volume level has risen after remaining low for a certain period, the first identifier automatic generation unit 350 responds by outputting a first identifier automatic generation signal at time t4. This signal is an identifier generation request signal and carries the attribute of a forward key.

【００４７】The third method is the keyword method. It associates a keyword with the voice data that contains it; specifically, it associates the keyword with the storage address of the corresponding voice data. That is, when input voice data contains a preset keyword, the storage address of that voice data is associated with the keyword, and the keyword is stored in the identifier data 211.
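
The keyword method can be illustrated with a minimal Python sketch. The names `KEYWORDS` and `index_keywords`, the example keywords, and the `(storage_address, recognized_text)` layout of a segment are all assumptions made for illustration; the description does not specify how the association is stored.

```python
from collections import defaultdict

# Hypothetical set of registered keywords (not taken from the patent).
KEYWORDS = {"failure", "refund"}


def index_keywords(segments, keywords=KEYWORDS):
    """Map each registered keyword to the storage addresses of the
    voice segments whose recognized text contains it.

    `segments` is an iterable of (storage_address, recognized_text) pairs.
    """
    index = defaultdict(list)
    for address, text in segments:
        lowered = text.lower()
        for keyword in keywords:
            if keyword in lowered:
                index[keyword].append(address)
    return dict(index)


# Example segments: storage address of the voice data and its recognized text.
segments = [
    (0x0000, "Hello, I am calling about my printer."),
    (0x4000, "There was a failure right after start-up."),
    (0x9000, "I would like a refund if the failure continues."),
]
keyword_index = index_keywords(segments)
```

Looking up a keyword in `keyword_index` then yields the addresses from which playback of the matching voice data could start.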

【００４８】The first and second identifiers generated as described above are stored as identifier data 211 in the corresponding customer voice data file 210. In the case of time stamps, the storage address of the data indicated by each time stamp can be calculated from the storage start time and storage end time of the voice data together with either the amount of voice data or its head and tail addresses; this gives the storage position of the corresponding recorded content. When an address is used as the identifier, that address is used as it is. Time stamps and addresses can also be used together, in which case a search can be made by either one.
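
One way to read the time stamp to address calculation is as a linear interpolation between the head and tail addresses. The sketch below assumes that voice data is written at a constant rate, which the description does not state; the function name `address_for_timestamp` and its parameters are hypothetical.

```python
def address_for_timestamp(t, t_start, t_end, head_addr, tail_addr):
    """Estimate the storage address of the audio recorded at time `t`.

    Assumes data is written at a constant rate between `t_start`
    (stored at `head_addr`) and `t_end` (stored at `tail_addr`).
    Times are in seconds; addresses are byte offsets.
    """
    if not t_start <= t <= t_end:
        raise ValueError("time stamp lies outside the recording")
    if t_end == t_start:
        return head_addr
    fraction = (t - t_start) / (t_end - t_start)
    return head_addr + round(fraction * (tail_addr - head_addr))


# Example: a 120 s recording occupying 960,000 bytes starting at address 1000
# (for instance 8,000 one-byte samples per second).
address = address_for_timestamp(30.0, 0.0, 120.0, 1000, 961000)
```

Because the result is only an estimate, it fits the remark in the description that a small offset between identifier and voice data is acceptable in practice.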

【００４９】In either case, the moment at which the identifier is generated does not necessarily coincide exactly with the moment at which the corresponding voice data is stored, so an offset may arise. However, what is searched for is not an instant of data but conversation content that continues for some time, so a small offset causes no practical problem.

【００５０】Time stamps can also be used to put in order the utterances of multiple speakers, that is, the parties to the dialogue. In telephone conversations, the parties' utterances often overlap. When a dialogue recorded in such an overlapping state is played back later, it is difficult to tell the speakers' utterances apart. In the present invention, therefore, the head of each speaker's utterance is located using the time stamps, and each speaker's continuous block of speech can be arranged in chronological order and played back without overlap. Specifically, for example, a read order table is created, and each speaker's blocks of speech are registered in it in time stamp order. At read-out, the blocks are read sequentially according to the read order table. This separates overlapping utterances and organizes the flow of the dialogue. The same arrangement can be applied not only to the recorded voice data but also to the customer text data 214 and the receptionist text data 215.
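
The read order table can be sketched as a merge of the two speakers' utterance lists sorted by time stamp. The `Utterance` record, its fields, and `build_read_order_table` are hypothetical names chosen for this Python illustration; the actual table layout is not given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    speaker: str    # "customer" or "receptionist"
    start: float    # time stamp of the head of the utterance, in seconds
    address: int    # storage address of the utterance's voice data


def build_read_order_table(*tracks):
    """Merge per-speaker utterance lists into one table ordered by time stamp."""
    merged = [utterance for track in tracks for utterance in track]
    return sorted(merged, key=lambda utterance: utterance.start)


customer = [Utterance("customer", 0.0, 0), Utterance("customer", 7.5, 60000)]
receptionist = [
    Utterance("receptionist", 6.8, 0),
    Utterance("receptionist", 12.1, 42400),
]
table = build_read_order_table(customer, receptionist)
```

Playing the entries of `table` one after another, each to its end, reproduces the dialogue without the overlap present in the original call.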

【００５１】Next, generation of the third identifier is described. Unlike the first and second identifiers, the third identifier is generated for text data rather than for voice data. That is, the text data produced by the voice recognition unit 330 is checked for preset keywords, and when a matching keyword is found, the third identifier is generated. The third identifier may therefore be generated after the recording has been completed.

【００５２】Next, the operation of this embodiment is described with reference to the flowchart of FIG. 5 in addition to the figures above.

【００５３】In this embodiment, the following processing takes place before the processing by the information processing apparatus. First, when a call arrives, the line control unit 510 sounds an audio device (not shown) to notify the receptionist of the incoming call. When the receptionist goes off-hook, a call is enabled via the transmitter/receiver device 520, and the speaker 531 and the microphone 532 are connected to the line through the transmitter/receiver device. When the receptionist goes off-hook, an off-hook signal is also sent to the information processing apparatus 100. When communication is cut off, a signal indicating this is likewise sent to the CPU 110.

【００５４】The information processing apparatus 100 monitors for input of the off-hook signal (step 1001). When a call arrives, it shows on the display screen of the display device 420 a prompt to ask the customer for identifying items such as the customer's name, telephone number, address, and contact person's name, thereby alerting the receptionist (step 1002). In a system that can capture the caller's telephone number at this stage, the captured number is sent to the CPU 110 and displayed on the display screen for confirmation.

【００５５】Next, the receptionist asks the customer on the incoming call about the customer-identifying items shown on the screen; the apparatus accepts input of the answers and creates a new customer voice data file 210 (step 1003). For example, it accepts input of the customer name, department name, contact person's name, telephone number, address, customer code, and so on. These inputs are accepted through the input device 410. Because the customer-identifying data is very important, this embodiment accepts it by manual input, but the invention is of course not limited to this. For example, recording as voice data may begin from this stage. The customer voice data file 210 is initially held in the working memory of the CPU 110. It is stored in the recording device 200 when it reaches a certain amount and when storage ends.

【００５６】When part of the information asked for above has been input, the customer data file may be searched for matching candidate customers, and any candidate found may be displayed on the display screen of the display device 420. The receptionist then only needs to confirm the items displayed, which reduces the input burden on the receptionist.

【００５７】Next, when input of the required identifying items is finished and the receptionist enters a start instruction through the input device 410, the apparatus accepts it and starts voice input processing (step 1004). That is, it waits for a voice data input request from either the A/D 310 or the A/D 320 (step 1005).

【００５８】When a voice data input request arrives, the apparatus determines whether it came from the A/D 310 or the A/D 320. If the input is not from the customer, the apparatus directs that the data be stored in the voice data file 210 as receptionist voice data 213 (step 1007). If it is the customer's voice, the apparatus directs that the data be stored in the customer voice data file 210 as customer voice data 212 (step 1008). In this embodiment, the receptionist's voice and the customer's voice are distinguished by extracting the received signal and the transmitted signal at the transmitter/receiver device, where the transmit side and the receive side are separated. The way of distinguishing customer from receptionist is of course not limited to this. For example, the two can be separated by the difference in voice level or by the difference in frequency bandwidth.
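
The routing of steps 1007 and 1008 can be sketched in Python as follows. Which converter carries which party is an assumption of this sketch (A/D 310 for the customer, A/D 320 for the receptionist); the class and attribute names are likewise hypothetical.

```python
from enum import Enum


class Source(Enum):
    AD_310 = 310
    AD_320 = 320


# Assumption for this sketch: A/D 310 digitizes the customer's voice and
# A/D 320 the receptionist's voice.
CUSTOMER_SOURCE = Source.AD_310


class CustomerVoiceDataFile:
    """Toy stand-in for the customer voice data file 210."""

    def __init__(self):
        self.customer_voice = bytearray()      # customer voice data 212
        self.receptionist_voice = bytearray()  # receptionist voice data 213

    def store(self, source, samples):
        """Append `samples` to the track that matches the requesting converter."""
        if source is CUSTOMER_SOURCE:
            self.customer_voice.extend(samples)
        else:
            self.receptionist_voice.extend(samples)


voice_file = CustomerVoiceDataFile()
voice_file.store(Source.AD_310, b"\x01\x02")
voice_file.store(Source.AD_320, b"\x0a")
voice_file.store(Source.AD_310, b"\x03")
```

Keeping the two tracks separate at write time is what later allows each speaker's utterances to be recognized and played back independently.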

【００５９】The apparatus then checks whether voice data input has ended (step 1009). Specifically, it checks for conditions such as loss of the off-hook signal or input of an end instruction from the input device 410. If none of these has occurred, input is regarded as continuing.

【００６０】The apparatus also checks whether there is an identifier generation request (step 1010), that is, whether there is identifier input from the first identifier automatic generation unit 350 or an identifier (second identifier) generation request from the input device 410.

【００６１】On detecting an identifier generation request signal from the first identifier automatic generation unit 350, the CPU 110 acquires the current time (including the date) from its built-in calendar and also acquires the storage destination address of the voice data to be recorded at that moment. It associates the two and stores them as identifier data 211 in the customer voice data file 210, together with attribute information indicating the identifier's preset attribute, which in this case is the forward key (step 1011). In other words, this embodiment uses a time stamp and an address together as the identifier.
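
A record that combines time stamp, storage address, and key attribute, as in step 1011, might look like the following Python sketch. `IdentifierRecord`, `IdentifierData`, and the attribute strings are hypothetical; the description does not define a concrete record format.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IdentifierRecord:
    timestamp: datetime   # calendar time at which the request was detected
    address: int          # storage destination address of the voice data
    attribute: str        # "forward" or "reverse"


class IdentifierData:
    """Toy stand-in for the identifier data 211."""

    def __init__(self):
        self.records = []

    def add(self, now, write_address, attribute="forward"):
        """Store one identifier; automatic requests default to "forward"."""
        if attribute not in ("forward", "reverse"):
            raise ValueError("attribute must be 'forward' or 'reverse'")
        record = IdentifierRecord(now, write_address, attribute)
        self.records.append(record)
        return record

    def by_attribute(self, attribute):
        return [r for r in self.records if r.attribute == attribute]


identifier_data = IdentifierData()
identifier_data.add(datetime(1998, 8, 20, 10, 15, 30), 0x2F00)
identifier_data.add(datetime(1998, 8, 20, 10, 16, 5), 0x4A80, "reverse")
```

Because each record holds both values, a later search can start from either the time stamp or the address.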

【００６２】When an identifier generation request arrives from the input device 410, the CPU 110 likewise acquires the current time from its built-in calendar and the storage destination address of the voice data to be recorded at that moment, associates them, and stores them as identifier data 211 in the customer voice data file 210 together with attribute information indicating the designated attribute of the identifier, namely forward key or reverse key (step 1011).

【００６３】Next, as in step 1009 above, the apparatus checks whether voice data input has ended (step 1012). If it has not, processing returns to step 1006. If it has (including the case where the end was detected in step 1009), a screen is displayed indicating that acceptance of the customer's telephone complaint or the like has ended (step 1013).

【００６４】At this point, the screen also asks whether the voice data file 210 should be transferred to the processor B. On accepting designation of a transfer destination and an instruction to transfer, the CPU 110 sends the voice data file 210 from the communication control device 430 to the processor B via the communication network M. Alternatively, the voice data file 210 need not be transferred; its contents may instead be sent in response to an inquiry from the processor B.

【００６５】Next, when the voice data recorded in the recording device 200 needs to be converted to text data, the CPU 110, on receiving an instruction from the input device 410, selects the voice data file 210 of the designated customer from among the voice data files 210 stored in the recording device 200 and transfers the customer voice data 212 recorded in it to the voice recognition unit 330, one processing-length portion at a time. The voice recognition unit 330 recognizes the transferred voice data using a voice pattern dictionary prepared in advance, converts the recognition result into character codes, and generates a character string for the transferred portion. The CPU 110 stores the generated character string in the voice data file 210 as text data 214. At this time, the string may be shown on the display device 420 so that the receptionist can check the conversion and correct the string through the input device 410. In this way, unclear speech-to-text conversions can be corrected to accurate content. If a kana-kanji conversion program is installed as a program of the CPU 110, the CPU 110 can use it to further convert the recognized character string into kana-kanji text. In that case, the converted mixed kana-kanji text can be shown on the display device 420 for the receptionist to check and, where necessary, edit through the input device. The customer's remarks can thus be recorded in more readable text that is easier to refer to.

【００６６】If a translation program is installed as a program in the memory 120, the data converted to mixed kana-kanji text can also be translated into text in a specific language. Of course, when a call is received in a foreign language such as English, it can be converted into mixed kana-kanji text or into text in another foreign language.

【００６７】Next, the third identifier automatic generation unit 360 receives the voice data that has been converted to text data and checks whether the text contains any keyword registered in advance. If it does, the display attribute of that portion is set to a specific mode so that its display appearance can be changed, for example by setting a display color, background color, blinking, shading, or underlining.
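
Setting a display attribute on keyword matches can be sketched in Python with the standard `re` module. Here the "display attribute" is represented by wrapping each match in marker strings, which is an assumption of this sketch; `mark_keywords` and the marker defaults are hypothetical.

```python
import re


def mark_keywords(text, keywords, open_tag="[[", close_tag="]]"):
    """Wrap every occurrence of a registered keyword in marker strings.

    A display layer could translate the markers into a color, underline,
    or similar attribute.
    """
    if not keywords:
        return text
    # Try longer keywords first so that a keyword containing a shorter
    # one is matched in full.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


marked = mark_keywords(
    "The printer shows a paper jam error.",
    {"paper jam", "error"},
)
```

The keywords are escaped with `re.escape`, so entries containing punctuation are matched literally.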

【００６８】The example above attaches the identifier to the text data itself, but the present invention is not limited to this. For example, when the text contains a character string matching a keyword, the keyword may be associated with the corresponding block of voice data and used as an identifier for that block. In this case, the keyword can serve as an identifier for both the voice data and the corresponding text data. In making this association, at least one of the time stamp and the storage address described above can additionally be associated. Associating the keyword with a time stamp in particular is convenient for putting the utterances of the dialogue in order.

【００６９】In this way, the dialogue with the customer received by telephone is stored in the recording device 200 as customer voice data, receptionist voice data, identifier data, and text data in the same customer voice data file 210. Later, by referring to the relevant customer voice data file, the processor can listen to the customer's voice, read the content as text data, and, by using the identifiers, play back only the necessary portions of the voice data. The necessary information can thus be obtained efficiently. The text data can also be displayed, or printed on a printer (not shown), for use. Because the display appearance of specific terms has been changed, specific keywords are easy to find.

【００７０】The example above converts voice data to text data as a batch process after recording. The present invention is not limited to this. For example, voice data may be converted to text data in real time while the call is in progress. In that case, all of the customer's voice data could be converted to text, but a simpler method is also possible. For example, the receptionist may repeat into the microphone, for confirmation, the important terms in the customer's remarks, and this speech of the receptionist may be converted into a character string by the voice recognition unit 330. The recognition rate can then be improved by training the voice recognition unit 330 for the specific speaker. It can be improved further if the receptionist, aware of the voice recognition, articulates clearly. In addition, the third identifier automatic generation unit 360 can be made to output automatically a request to generate an identifier for each important word uttered by the receptionist.

【００７１】As another mode of voice recognition, for example, voice recognition may be performed automatically on the voice data lying between a forward key and a reverse key. Important remarks are thereby converted to text data, which makes it easier for the processor to find the important portions.
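
Selecting the voice data between a forward key and a reverse key can be sketched in Python as below. The pairing rule is an assumption of this sketch: each reverse key closes the span opened by the nearest preceding forward key, and a reverse key with no open forward key is ignored. The function name and the `(address, attribute)` tuples are hypothetical.

```python
def marked_segments(identifiers):
    """Return (start_address, end_address) spans lying between a forward
    key and the next reverse key.

    `identifiers` is a list of (address, attribute) pairs sorted by
    address, where attribute is "forward" or "reverse".
    """
    segments = []
    start = None
    for address, attribute in identifiers:
        if attribute == "forward":
            # A later forward key replaces an earlier unpaired one.
            start = address
        elif attribute == "reverse" and start is not None:
            segments.append((start, address))
            start = None
    return segments


spans = marked_segments([
    (1000, "forward"),
    (5000, "reverse"),
    (8000, "reverse"),   # no open forward key: ignored
    (9000, "forward"),
    (9500, "forward"),   # replaces the forward key at 9000
    (12000, "reverse"),
])
```

Only the voice data inside the returned spans would then be handed to the voice recognition unit.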

【００７２】Next, another embodiment of the present invention is described. FIG. 6 shows its hardware system configuration. This embodiment has the same hardware system configuration as the embodiment shown in FIG. 3 described above, except that the voice recognition unit 330, the third identifier automatic generation unit 360, the volume detection unit 340, and the first identifier automatic generation unit 350 are each provided in duplicate, one for the customer and one for the receptionist, and that in the recording device 200 the text data is divided into customer text 214 and receptionist text 215.

【００７３】Various devices are connected to the information processing apparatus 100 through the interface 130. Specifically, the recording device 200, the analog-to-digital converter (A/D) 310, the analog-to-digital converter (A/D) 320, the voice recognition units 330a and 330b, the volume detection units 340a and 340b, the first identifier automatic generation units 350a and 350b, the third identifier automatic generation units 360a and 360b, the voice synthesis unit 370, and the digital-to-analog converter (D/A) 380 are connected in a form built into the information processing apparatus 100. Each of these devices may of course be an external device instead. The input device 410, the display device 420, the communication control device 430, and the audio output unit 440 are also connected to the information processing apparatus 100. In this embodiment, the voice recognition units 330a and 330b, the volume detection units 340a and 340b, the first identifier automatic generation units 350a and 350b, and the third identifier automatic generation units 360a and 360b are provided separately for the customer and for the receptionist, but the present invention is not limited to this. A single unit of each kind may serve both the customer and the receptionist. Conversely, to handle three or more speakers, the voice recognition unit and the other units may be multiplexed so that one is provided for each speaker.

【００７４】The recording device 200 is a hard disk device, as in the first embodiment. For each customer who calls, the recording device 200 holds a customer voice file 210 and customer data 290 containing the customer's address, telephone number, various attribute data, and so on. The customer voice file 210 records customer voice data 212, which is the digitized voice signal of the customer; receptionist voice data 213, which is the digitized voice signal of the receptionist; customer text data 214 and receptionist text data 215 obtained by voice recognition; and identifier data 211, described later.

【００７５】In this embodiment, the identifier data 211, the customer voice data 212, the receptionist voice data 213, the customer text data 214, and the receptionist text data 215 are stored in the same file. However, the present invention is not limited to this. Some or all of the data may be stored independently rather than grouped by customer. For example, the identifier data 211 may be stored on its own, like the customer data 290. Alternatively, the identifier data 211, the customer voice data 212, the receptionist voice data 213, the customer text data 214, and the receptionist text data 215 may all be stored independently of one another.

【００７６】The voice recognition units 330a and 330b read the digitized voice signal, convert the speech into the corresponding characters or character strings, and generate text data. In this embodiment they are connected as independent devices, although the processing may of course be performed by the CPU 110. In this embodiment, the voice recognition unit 330a recognizes the customer's voice and is therefore of the speaker-independent type. The voice recognition unit 330b recognizes the receptionist's voice and is therefore of the speaker-dependent type. Both may of course be of the speaker-independent type.

【００７７】The volume detection units 340a and 340b have the function of detecting the point at which the level rises sharply after the voice has paused. The volume detection unit 340a detects the level of the customer's voice signal and judges whether it is at or above a predetermined reference value. While the level is below the reference value, it judges that speech has paused; when the level subsequently rises above the reference value, it judges that speech has begun and outputs a cue signal. FIG. 4 shows an example. Before time t4, the volume level remains low for a while (silence), and at time t4 it rises sharply. The volume detection unit 340a detects this state and outputs the cue signal, which is input to the first identifier automatic generation unit 350a. The volume detection unit 340b performs volume detection for the receptionist, detects the onset of speech after a pause in the same way as for the customer, and sends the cue signal to the first identifier automatic generation unit 350b. In this embodiment, volume detection is performed for both the customer and the receptionist, but it may be performed for only one of them, for example only for the customer's voice.
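
The onset detection performed by the volume detection units can be sketched in Python as a threshold test with a minimum silence length. The function name, the frame-based level list, and the `min_silence` parameter are assumptions of this sketch; the description only says that the low-level state lasts "for a certain period".

```python
def detect_onsets(levels, threshold, min_silence):
    """Return the indices at which speech is judged to begin.

    An onset is a frame whose level is at or above `threshold` and that
    follows at least `min_silence` consecutive frames below `threshold`.
    `levels` is a sequence of per-frame volume levels.
    """
    onsets = []
    quiet_run = 0
    for index, level in enumerate(levels):
        if level < threshold:
            quiet_run += 1
        else:
            if quiet_run >= min_silence:
                onsets.append(index)
            quiet_run = 0
    return onsets


levels = [5, 6, 1, 0, 1, 0, 7, 8, 1, 9, 0, 0, 0, 6]
onsets = detect_onsets(levels, threshold=4, min_silence=3)
```

Each returned index corresponds to a moment such as t4 in FIG. 4, at which a cue signal would be issued; the one-frame dip at index 8 is too short to count as a pause.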

【００７８】The first identifier automatic generation units 350a and 350b receive the output signals of the corresponding volume detection units 340a and 340b, respectively, and generate a first identifier for that moment. This identifier is described later. Each generated identifier is stored in the corresponding customer voice data file 210.

【００７９】The third identifier generation units 360a and 360b check whether the text data produced by the corresponding voice recognition units 330a and 330b contains a preset keyword and, if it does, generate a third identifier.

【００８０】The voice synthesis unit 370 reads text data and converts it into a voice signal of a predetermined sound quality. This voice signal is analog data and is output as sound by the audio output unit 440.

【００８１】The D/A 380 converts the customer voice and the receptionist voice recorded in digital form into analog signals. The converted analog voice signals are output as sound by the audio output unit 440.

[0082] The audio output device 440 converts the analog voice signal output from the voice synthesis unit 370 into sound and outputs it. Specifically, it consists of an amplifier and a speaker. It may also be configured with headphones.

[0083] Next, the operation of the reception processing in the present embodiment is described with reference to the flowchart of FIG. 5 in addition to FIG. 6. Because the basic processing procedure is the same as in the embodiment of FIG. 3 described above, the description here focuses on the differences.

[0084] In the present embodiment, steps 1001 to 1009 are executed in the same way as in the embodiment described above. The description of these steps is not repeated.

[0085] It is then checked whether there is an identifier generation request (step 1010). This step checks for identifier input from the first identifier automatic generation unit 350a or 350b and for an identifier (second identifier) generation request from the input device 410. On detecting an identifier generation request signal from the first identifier automatic generation unit 350a or 350b, the CPU 110 obtains the time at that moment (including the year, month, and day) from its built-in calendar and also obtains the storage destination address of the voice data to be recorded at that moment. It associates the two and stores them in the customer voice data file 210 as identifier data 211, together with the preset attribute of the identifier, which in this case is attribute information indicating the forward key (step 1011). Steps 1012 and 1013 are then executed.
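Step 1011 can be pictured with the small data model below. It is a sketch under stated assumptions: the class names `IdentifierData` and `CustomerVoiceDataFile`, the use of a byte offset as the storage address, and the string `"forward"` for the forward key attribute are illustrative choices, not details given in the description.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class IdentifierData:
    timestamp: datetime  # time of the request, including year, month and day
    address: int         # storage address of the voice data being recorded
    attribute: str       # preset attribute, e.g. "forward" for the forward key


@dataclass
class CustomerVoiceDataFile:
    voice_data: bytearray = field(default_factory=bytearray)
    identifiers: List[IdentifierData] = field(default_factory=list)

    def append_voice(self, chunk: bytes) -> None:
        """Record a chunk of digitized voice data."""
        self.voice_data.extend(chunk)

    def add_identifier(self, now: datetime, attribute: str = "forward") -> IdentifierData:
        """Associate the current time with the address where recording continues."""
        ident = IdentifierData(now, len(self.voice_data), attribute)
        self.identifiers.append(ident)
        return ident


# Hypothetical usage: 160 bytes are already recorded when the request arrives.
record = CustomerVoiceDataFile()
record.append_voice(b"\x00" * 160)
ident = record.add_identifier(datetime(1998, 8, 20, 10, 30, 0))
```

Storing the time and the address together lets a later reader locate the marked part of the recording by either one.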

[0086] Next, the case of associating an identifier with voice data that has been converted into text data is described. In this case, the corresponding third identifier automatic generation unit 360a or 360b checks whether the text data contains a keyword registered in advance. If it does, the display attribute of that part is set to a specific mode so that its display mode can be changed, for example by applying a display color, a background color, blinking, shading, or an underline.
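One way to represent the display attributes described above is as a list of character spans, each carrying the attributes to apply. The sketch below is illustrative only; the function name `mark_keywords` and the attribute dictionary (for example `{"underline": True}`) are assumptions, and how a display device would render them is left open.

```python
from typing import Dict, List, Sequence, Tuple

# (start offset, end offset, display attributes) for one highlighted part
Span = Tuple[int, int, Dict[str, object]]


def mark_keywords(text: str, keywords: Sequence[str],
                  attributes: Dict[str, object]) -> List[Span]:
    """Return the spans of `text` that contain a registered keyword."""
    spans: List[Span] = []
    for keyword in keywords:
        if not keyword:
            continue  # ignore empty keywords
        start = text.find(keyword)
        while start != -1:
            end = start + len(keyword)
            spans.append((start, end, dict(attributes)))
            start = text.find(keyword, end)
    return sorted(spans, key=lambda span: span[0])


# Hypothetical recognized text with two registered keywords.
text = "smoke from the printer"
spans = mark_keywords(text, ["smoke", "printer"], {"underline": True})
```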

[0087] In the present embodiment, the receptionist may also, for example, repeat into the microphone the important terms contained in the customer's remarks by way of reconfirmation, and this voice of the receptionist may be converted into a character string by the voice recognition unit 330b. In this case, training the voice recognition unit 330b for a specific speaker improves the recognition rate. The recognition rate improves further when the receptionist, aware of the voice recognition, pronounces the terms clearly. In addition, for an important word uttered by the receptionist, the third identifier automatic generation unit 360b can be made to output automatically a request to generate an identifier. In this case, the important word is associated, as a keyword, with the group of voice data that contains it.

[0088] As another mode of voice recognition, as described above, the voice data in the portion sandwiched between a forward key and a reverse key, for example, can be subjected to voice recognition automatically.
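Selecting the portions to recognize can be sketched as pairing each forward key with the next reverse key. The representation of an identifier as an `(address, attribute)` tuple and the attribute strings `"forward"` and `"reverse"` are assumptions for this example; the recognition step itself is not shown.

```python
from typing import List, Optional, Sequence, Tuple


def keyed_regions(identifiers: Sequence[Tuple[int, str]]) -> List[Tuple[int, int]]:
    """Return (start, end) addresses bounded by a forward key and the next reverse key."""
    regions: List[Tuple[int, int]] = []
    open_start: Optional[int] = None
    for address, attribute in sorted(identifiers):
        if attribute == "forward" and open_start is None:
            open_start = address  # region begins at the forward key
        elif attribute == "reverse" and open_start is not None:
            regions.append((open_start, address))  # region ends at the reverse key
            open_start = None
    return regions


# Hypothetical recording of 10 bytes with one forward/reverse pair.
voice = bytes(range(10))
segments = [voice[start:end] for start, end in keyed_regions([(2, "forward"), (5, "reverse")])]
```

A forward key with no following reverse key yields no region in this sketch; whether such a region should instead run to the end of the recording is a design choice the description leaves open.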

[0089] In the present embodiment, a plurality of voice recognition units and related units are provided, one set per speaker, so each speaker can be processed independently. As a result, even a large amount of information can be handled promptly.

[0090] In each of the above examples, voice recognition is performed in the same information processing device 100, but the invention is of course not limited to this. For example, it may be performed in a separate information processing device.

[0091] In each of the above examples an A/D is used, but when voice data is input to the information processing device 100 in the form of digital data, the A/D can be omitted. In this case, a D/A is used for reproduction through the speaker 531, and an A/D is used to digitize the input from the microphone 532.

[0092] Furthermore, in each of the examples described above, the dialogue of both the customer and the receptionist is recorded. However, only the remarks of one party to the dialogue may be recorded; for example, only the customer's voice may be recorded.

[0093] The examples described above show reception by telephone, but the invention is also applicable where no telephone is used, that is, where the dialogue takes place face to face at a counter or the like. In this case, to distinguish the speakers clearly, a separate microphone may be provided for each speaker in advance so that the speakers are distinguished by microphone input.

[0094]

[Effects of the Invention] According to the present invention, complaints, inquiries, failure reports, accident reports, and the like can be accepted by voice and the accepted information can be recorded. Moreover, because identifiers can be placed at specific locations, the information can be passed to the person who processes it in a form usable for subsequent processing.

[Brief description of the drawings]

FIG. 1 is an explanatory diagram outlining the environment to which the voice reception processing system of the present invention is applied.

FIG. 2 is an explanatory diagram outlining the exchange of information at a reception center to which the present invention is applied.

FIG. 3 is a block diagram showing an example of the configuration of a hardware system used in an embodiment of the present invention.

FIG. 4 is an explanatory diagram showing an example of identifier generation by the volume detection unit used in the present invention.

FIG. 5 is a flowchart showing the procedure of the dialogue recording processing in the present embodiment.

FIG. 6 is a block diagram showing an example of the configuration of a hardware system used in another embodiment of the present invention.

[Description of Reference Signs] 100: information processing device, 110: central processing unit (CPU), 120: memory, 130: interface, 200: recording device, 210: customer voice data file, 211: identifier data, 212: customer voice data, 213: receptionist voice data, 214: text data, 310, 320: analog-to-digital converter (A/D), 330, 330a, 330b: voice recognition unit, 340, 340a, 340b: volume detection unit, 350, 350a, 350b: first identifier automatic generation unit, 360, 360a, 360c: third identifier automatic generation unit, 410: input device, 420: display device, 430: communication control device, 500: telephone device, 510: line control unit, 520: transmitting/receiving device, 531: speaker, 532: microphone.

Continued from the front page
(72) Inventor: Hideo Inoue, c/o Hitachi Electronics Services Co., Ltd., 504-2 Shinano-cho, Totsuka-ku, Yokohama-shi, Kanagawa
(72) Inventor: Norikazu Yamagishi, c/o Hitachi Electronics Services Co., Ltd., 504-2 Shinano-cho, Totsuka-ku, Yokohama-shi, Kanagawa
(72) Inventor: Mutsuharu Takesada, c/o Hitachi Electronics Services Co., Ltd., 504-2 Shinano-cho, Totsuka-ku, Yokohama-shi, Kanagawa
F-terms (reference): 5B075 KK02 KK34 KK35 ND03 ND14 ND24 ND26 NK02 NK13 NK24 NK31 UU40

Claims

[Claims]

1. A dialogue recording system for recording the content of at least one side of a dialogue, comprising: a recording device that records voice data of the dialogue; and an information processing device that generates an identifier for identifying a specific part of the voice data to be recorded and causes the recording device to record it, wherein the information processing device accepts a request to generate an identifier for identifying a specific part of the voice data recorded in the recording device, generates the identifier, and records the identifier in the recording device in association with the voice data to be recorded.

2. The dialogue recording system according to claim 1, further comprising: volume detection means for detecting, based on the volume of the input voice, the rise of the voice after the voice has been interrupted; and means for outputting an identifier generation request signal for identifying the voice data input at that time, wherein the information processing device receives the identifier generation request signal and generates an identifier.

3. The dialogue recording system according to claim 1, further comprising an input device having a key for requesting generation of an identifier, wherein the information processing device receives from the input device a signal requesting generation of an identifier and generates the identifier.

4. The dialogue recording system according to claim 3, wherein the identifier generation request signals from the input device include a signal that identifies an area recorded later in time than the position of the identifier and a signal that identifies an area recorded earlier in time than the position of the identifier, and the input device has a function for inputting each of them.

5. The dialogue recording system according to claim 1, wherein the information processing device uses, as the identifier, time information for the point in time at which the identifier generation request signal is received.

6. The dialogue recording system according to claim 1, wherein the information processing device uses, as the identifier, the address in the recording device at which the voice data corresponding to the point in time at which the identifier generation request signal is received is recorded.

7. The dialogue recording system according to claim 1, further comprising voice recognition means for performing voice recognition on digitized voice data and converting it into a character string, wherein the information processing device causes the recording device to record the converted character string as text data.

8. The dialogue recording system according to claim 7, further comprising means for examining the character string converted from the voice by the voice recognition means to determine whether it contains a predetermined keyword and, if it does, changing the display mode of the corresponding part.

9. The dialogue recording system according to claim 1, further comprising: voice recognition means for performing voice recognition on digitized voice data and converting it into a character string; and means for examining the character string converted from the voice by the voice recognition means to determine whether it contains a predetermined keyword and, if it does, associating the keyword with the group of voice data corresponding to it and setting an identifier for identifying that group of voice data.

10. The dialogue recording system according to any one of claims 1, 2, 3, 4, 5, 6, 7, 8 and 9, wherein the information processing device records the voice of each speaker in the dialogue in a different area of the recording device.

11. A reception processing system comprising: a telephone device for receiving incoming calls from outside and conducting telephone calls; and the dialogue recording system according to any one of claims 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10, wherein the telephone device has a microphone and a speaker for the call and inputs the received voice signal and the voice signal from the microphone to the information processing device, and the information processing device causes the recording device to record the input voice signals.