WO2020187301A1 - Program name search support device and program name search support method
- Publication number
- WO2020187301A1 (PCT/CN2020/080259)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- program name
- program
- name
- retrieval
- text data
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/435—Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/439—Processing of audio elementary streams
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/482—End-user interface for program selection
Definitions
- the present embodiment relates to a program name search support device and a program name search support method.
- In AV equipment such as televisions, an EPG (Electronic Program Guide) obtained from broadcast waves is generally used to search for broadcast programs.
- Conventionally, when a broadcast program name is searched using search keywords, the buttons of a remote controller are operated to input the characters to be searched.
- Patent Document 1 JP 2012-168349 A
- the purpose of the present embodiment is to provide a program name search support device and a program name search support method that can improve the accuracy of program name search by voice search.
- the program name search support device of the present embodiment has: a program information storage unit that stores, as a pair, a first program name, which is the notation of a program name stored in a program name search target device, and the reading (pronunciation) of that program name; a text data acquisition circuit that acquires a second program name, which is text data obtained by performing voice recognition processing on voice data based on the reading of the program name stored in the program information storage unit; and a replacement dictionary that stores the first program name and the second program name as a pair when the first program name differs from the second program name.
- FIG. 1 is a schematic diagram showing an example of the structure of a program name search system using the program name search support device of this embodiment
- FIG. 2 is a block diagram showing the structure of a language processing server as an example of the program name search support device of the embodiment
- FIG. 3 is a diagram illustrating an example of new program information data
- FIG. 4 is a diagram illustrating an example of program name replacement data
- FIG. 5 is a flowchart illustrating an example of a method of creating a replacement dictionary.
- 1... Language processing server, 2... Broadcast receiving device, 3... Voice recognition server, 4... Internet line, 21... Remote controller, 51... Text data communication circuit, 52... Language data analysis circuit, 53... Command output circuit, 54... Replacement dictionary registration circuit, 55... Voice data transmission circuit, 56... Voice conversion circuit, 61... New program information database, 62... Replacement dictionary.
- FIG. 1 is a schematic diagram showing an example of the structure of a program name search system using the program name search support device of the present embodiment.
- the program name retrieval system includes a language processing server 1, a broadcast receiving device 2 as a device for program name retrieval, and a voice recognition server 3.
- the language processing server 1, the broadcast receiving device 2, and the voice recognition server 3 are connected to each other via the Internet line 4.
- the broadcast receiving device 2 receives and plays back programs provided by broadcast operators through radio waves propagating in space, and programs supplied by distribution operators through networks such as cable networks and IP networks.
- the broadcast receiving device 2 receives operation instructions from the user via a remote control unit (hereinafter referred to as a remote controller) 21.
- the broadcast receiving device 2 may be a structure including a recording and playing device that records the received program.
- the remote controller 21 includes operation keys (number keys, arrow keys, color buttons, etc.) and a microphone.
- the user can transmit voice data to the broadcast receiving device 2 via the microphone of the remote control 21 by performing a predetermined operation on the remote control 21 such as pressing a microphone button which is one of the operation keys. That is, the user can input an operation instruction to the broadcast receiving device 2 using voice data.
- the voice recognition server 3 is a server that provides cloud-based voice recognition services.
- the voice recognition server 3 converts voice data sent from a device connected to the Internet line 4 into text data and outputs it.
- the language processing server 1 as a program name search support device has a processor 11.
- the language processing server 1 analyzes the text data input from the broadcast receiving device 2 or the speech recognition server 3, extracts the operation content and parameters of the broadcast receiving device 2, and converts them into a format that can be processed by the broadcast receiving device 2 and outputs it.
- FIG. 2 is a block diagram showing the structure of the language processing server 1.
- the language processing server 1 includes circuits such as a text data communication circuit 51, a language data analysis circuit 52, a command output circuit 53, a replacement dictionary registration circuit 54, a voice data transmission circuit 55, and a voice conversion circuit 56.
- the language processing server 1 also includes storage devices such as a new program information database 61 and a replacement dictionary 62.
- the functions of the text data communication circuit 51, the language data analysis circuit 52, the command output circuit 53, the replacement dictionary registration circuit 54, the voice data transmission circuit 55, and the voice conversion circuit 56 may be realized in software by a CPU (Central Processing Unit) serving as the processor 11, or in hardware using an FPGA (Field Programmable Gate Array) or the like.
- the text data communication circuit 51, serving as a text data acquisition circuit, controls the transmission and reception of text data with devices connected to the Internet line 4 (for example, the broadcast receiving device 2 and the voice recognition server 3).
- for example, it acquires the text data output from the voice recognition server 3, or acquires electronic program guide (EPG) information transmitted from a broadcasting station or the like.
- when the text data communication circuit 51 obtains new program information from another server (not shown), it registers the information in the new program information database 61.
- new program information is information on a program whose program name has newly appeared and has not previously been acquired in the EPG information.
- Fig. 3 is a diagram illustrating an example of new program information data.
- the new program information data shown in FIG. 3 is registered in the new program information database 61.
- in the new program information data, one record is created for each program, and each record includes, for example, three items: "program name", "reading" (the reading of the program name written in hiragana), and "(information) acquisition date".
- "program name" and "reading" are set to the data extracted from the new program information acquired by the text data communication circuit 51.
- "acquisition date" is set to the date on which the new program information of the record was received from the other server.
- the language data analysis circuit 52 performs, as necessary, natural language analysis processing such as morpheme analysis and grammatical analysis on the text data (text constituted as a natural sentence) obtained by the text data communication circuit 51, to grasp the semantic content (operation content) of the text data. For example, when text data such as "I want to watch 〇〇 (program name)" is input, it is analyzed as the semantics "search" and "〇〇 (program name)". Furthermore, if a program name is included in the analysis result, the circuit checks whether that program name is registered in the replacement dictionary 62. If it is registered, the text data of the program name obtained as the analysis result is replaced with the other text data designated in the replacement dictionary 62 and output.
- the command output circuit 53 converts the analysis result in the language data analysis circuit 52 into a form that can be processed in the broadcast receiving device 2 and outputs it.
- in the example above, an operation instruction signal is output to the broadcast receiving device 2 so that a "search" operation with "〇〇 (program name)" as the keyword is performed.
- the voice conversion circuit 56 converts the input text data into voice data.
- the converted voice data is output to the voice data transmission circuit 55.
- the voice data transmission circuit 55 transmits the voice data input from the voice conversion circuit 56 to a device connected to the Internet line 4 (for example, the voice recognition server 3).
- the replacement dictionary registration circuit 54 creates program name replacement data and registers it in the replacement dictionary 62.
- Fig. 4 is a diagram illustrating an example of program name replacement data.
- the program name replacement data creates one record for each program, and each record includes, for example, three items of "input program name", "replacement program name", and "registration date”.
- the "input program name" holds the program name (text data) obtained when the voice recognition server 3 converts the "reading" of the "program name" in the new program information into text data.
- the "replacement program name" holds the other text data to be used in place of the text data registered in the "input program name" (specifically, the "program name" of the new program information).
- the "registration date" holds the date on which the record was registered in the replacement dictionary 62.
- the language data analysis circuit 52 refers to the substitution dictionary 62.
- FIG. 5 is a flowchart illustrating an example of a method of creating a replacement dictionary.
- the replacement dictionary registration circuit 54 extracts, from the new program information database 61, the program names that are candidates for registration in the replacement dictionary (S2). For example, it extracts the programs registered in the new program information database 61 after the date on which the replacement dictionary was last created. If that date is, for example, November 1, 2018, the records whose "acquisition date" is November 2, 2018 or later are extracted, and the programs registered in the "program name" of the extracted records become replacement dictionary registration candidates.
- alternatively, each record in the new program information database 61 may be given in advance a flag indicating whether it has already been extracted as a replacement dictionary registration candidate, and the programs registered in the "program name" of records whose flag indicates that they have not been extracted may be used as replacement dictionary registration candidates.
- the replacement dictionary registration circuit 54 converts the program name extracted in S2 into voice data and outputs it to the voice recognition server 3 (S3). Specifically, the replacement dictionary registration circuit 54 first outputs the extracted program name (text data) to the voice conversion circuit 56. The voice conversion circuit 56 converts the input text data into voice data and outputs it to the voice recognition server 3 via the voice data transmission circuit 55. When a plurality of program names are extracted in S2, one program name is selected from them and step S3 is executed.
- the process of converting the extracted program name (text data) into voice data is not limited to the voice conversion circuit 56, and may be performed by another server or the like that can receive text data from the language processing server 1 and has a voice conversion function.
- for example, when the broadcast receiving device 2 has a voice conversion circuit, the program name may be sent from the replacement dictionary registration circuit 54 to the broadcast receiving device 2 via the text data communication circuit 51 and the Internet line 4, and the voice conversion circuit of the broadcast receiving device 2 may convert the text data into voice data.
- in this case, the converted voice data is output from the broadcast receiving device 2 to the voice recognition server 3 via the Internet line 4.
- the text data communication circuit 51 of the language processing server 1 obtains from the voice recognition server 3 the recognition result of the voice data output in S3, that is, the text data converted from the voice data (S4).
- the text data communication circuit 51 outputs the obtained text data to the replacement dictionary registration circuit 54.
- the replacement dictionary registration circuit 54 compares the input text data with the program name (text data) output in S3 (S5). When the two do not match (S5, No), the compared text data is registered as a new record in the replacement dictionary 62 (S6). That is, a new record is registered in which the program name (text data) output to the voice recognition server 3 in S3 is set as the "replacement program name", the text data acquired from the voice recognition server 3 in S4 is set as the "input program name", and the date on which the registration is performed is set as the "registration date".
- the voice retrieval of a specific program name is performed as follows.
- the user uses the remote controller 21 of the broadcast receiving device 2 to state by voice the intention to search for a specific program name.
- for example, to search for the program "BACK STREET KIDS", the user says "xiangyaokanbackstreetkids" toward the microphone of the remote controller 21 ("xiangyaokanbackstreetkids" is the pronunciation of the instruction spoken by the user).
- the broadcast receiving device 2 transmits voice data input from the user (for example, voice data “xiangyaokanbackstreetkids”) to the voice recognition server 3 via the Internet line 4.
- the voice recognition server 3 converts the input voice data into text data, and sends it to the language processing server 1 via the Internet line 4.
- the voice data "xiangyaokanbackstreetkids” is converted into text data "Want to see BACKSTREETKIDS” and sent (in the case of Japanese, it is converted into text data of a combination of Katakana and Kanji).
- the text data input from the voice recognition server 3 is output from the text data communication circuit 51 to the language data analysis circuit 52.
- the language data analysis circuit 52 performs natural language analysis processing on the input text data, grasps the semantic content of the text data, and generates an analysis result. For example, when the text data "Want to see BACKSTREETKIDS" is input, the following analysis result is generated: perform the "search" operation with the program name "BACKSTREETKIDS" (in Katakana in the case of Japanese) as the keyword.
- the language data analysis circuit 52 checks whether the program name included in the analysis result has been registered as an "input program name" in the replacement dictionary 62. If it has, the program name included in the analysis result is replaced with the text data registered in the "replacement program name" of that record. The analysis result with the replaced program name is then output to the command output circuit 53.
- for example, when the analysis result contains the program name "BACKSTREETKIDS" (in Katakana in the case of Japanese), the language data analysis circuit 52 refers to the replacement dictionary 62 and checks whether that program name (text data) has been registered as an "input program name". If it has, the program name (text data) registered in the "replacement program name" of that record is extracted.
- when the replacement dictionary registration based on the EPG notation has already been carried out as described above, a record whose "input program name" is "BACKSTREETKIDS" (in Katakana in the case of Japanese) and whose "replacement program name" is "BACK STREET KIDS" has been registered in the replacement dictionary 62. Therefore, the language data analysis circuit 52 refers to the replacement dictionary 62 and replaces the program name (character string) "BACKSTREETKIDS" in the analysis result with "BACK STREET KIDS". The following analysis result is then output to the command output circuit 53: perform the "search" operation with the program name "BACK STREET KIDS" as the keyword.
- the command output circuit 53 converts the analysis result input from the language data analysis circuit 52 into a format that can be processed in the broadcast receiving device 2 and outputs it.
- the operation instruction signal is output to the broadcast receiving device 2 so as to "search” "BACK STREET KIDS" in the program name such as the program table (EPG) and recording data.
- the broadcast receiving device 2 extracts the program name marked "BACK STREET KIDS" from the stored program table (EPG) and the like, and displays it on the search result screen or the like.
- if the text data output from the voice recognition server 3, namely "BACKSTREETKIDS", were used as it is to give a search instruction to the broadcast receiving device 2, the desired program could not be found, because the text differs from the notation ("BACK STREET KIDS") in the program guide (EPG).
- in the present embodiment, by contrast, for a program name whose notation in the text data output from the voice recognition server 3 differs from the notation in the EPG, the language processing server 1 replaces it with the EPG notation before giving the operation instruction to the broadcast receiving device 2. This reduces search omissions caused by mismatched notation and improves search accuracy in voice search.
- the case in which the broadcast receiving device 2 performs a voice search for a program name has been described above as an example, but the embodiment can also be applied to voice searches for programs supplied by distribution operators through networks such as Internet TV.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Computational Linguistics (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
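The present invention provides a program name search support device and a program name search support method capable of improving the accuracy of program name searches performed by voice search. The device has: a new program information database (61) that stores, as a pair, a first program name, which is the notation of a program name stored in a broadcast receiving device (2), and the reading of the program name; a text data communication circuit (51) that acquires a second program name, which is text data obtained by performing voice recognition processing on voice data based on the reading of the program name stored in the new program information database (61); and a replacement dictionary (62) that stores the first program name and the second program name as a pair when the first program name differs from the second program name.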
Description
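This application claims priority to Japanese Patent Application No. 2019-053657, filed with the Japan Patent Office on March 20, 2019 and entitled "Program name search support device and program name search support method", the entire contents of which are incorporated herein by reference.
The present embodiment relates to a program name search support device and a program name search support method.
In AV equipment such as televisions, broadcast programs are generally searched using an EPG (Electronic Program Guide) obtained from broadcast waves. Conventionally, when a broadcast program name is searched using search keywords, the characters to be searched are input by operating the buttons of a remote controller.
In recent years, voice recognition technology that recognizes an operator's voice and operates AV equipment or the like based on the recognition result has come into practical use (see, for example, Patent Document 1). For example, when an operator speaks a search keyword toward a remote controller equipped with a microphone, a broadcast program name can be searched based on the voice recognition result. Compared with a search by character input, such a voice search takes less effort and less time, and contributes greatly to the operator's convenience.
However, voice search has the following problem: when the program name (character string) obtained as the voice recognition result does not match the program name acquired from the EPG, the program cannot be found.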
Prior art documents
Patent documents
Patent Document 1: JP 2012-168349 A
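Summary of the invention
The purpose of the present embodiment is to provide a program name search support device and a program name search support method capable of improving the accuracy of program name searches performed by voice search.
The program name search support device of the present embodiment has: a program information storage unit that stores, as a pair, a first program name, which is the notation of a program name stored in a program name search target device, and the reading of the program name; a text data acquisition circuit that acquires a second program name, which is text data obtained by performing voice recognition processing on voice data based on the reading of the program name stored in the program information storage unit; and a replacement dictionary that stores the first program name and the second program name as a pair when the first program name differs from the second program name.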
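FIG. 1 is a schematic diagram showing an example of the structure of a program name search system using the program name search support device of the present embodiment;
FIG. 2 is a block diagram showing the structure of a language processing server as an example of the program name search support device of the embodiment;
FIG. 3 is a diagram illustrating an example of new program information data;
FIG. 4 is a diagram illustrating an example of program name replacement data;
FIG. 5 is a flowchart illustrating an example of a method of creating a replacement dictionary.
Description of reference signs
1... language processing server, 2... broadcast receiving device, 3... voice recognition server, 4... Internet line, 21... remote controller, 51... text data communication circuit, 52... language data analysis circuit, 53... command output circuit, 54... replacement dictionary registration circuit, 55... voice data transmission circuit, 56... voice conversion circuit, 61... new program information database, 62... replacement dictionary.
Hereinafter, embodiments are described with reference to the drawings.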
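FIG. 1 is a schematic diagram showing an example of the structure of a program name search system using the program name search support device of the present embodiment. The program name search system includes a language processing server 1, a broadcast receiving device 2 serving as the device whose program names are searched, and a voice recognition server 3. The language processing server 1, the broadcast receiving device 2, and the voice recognition server 3 are connected to one another via an Internet line 4.
The broadcast receiving device 2 receives and plays back programs provided by broadcast operators through radio waves propagating in space, programs supplied by distribution operators through networks such as cable networks and IP networks, and the like. The broadcast receiving device 2 receives operation instructions from the user via a remote control unit (hereinafter referred to as a remote controller) 21. The broadcast receiving device 2 may also include a recording and playback device that records received programs.
The remote controller 21 has operation keys (number keys, arrow keys, color buttons, and the like) and a microphone. For example, by performing a predetermined operation on the remote controller 21, such as pressing a microphone button that is one of the operation keys, the user can send voice data to the broadcast receiving device 2 via the microphone of the remote controller 21. That is, the user can input operation instructions to the broadcast receiving device 2 using voice data.
The voice recognition server 3 is a server that provides a cloud-based voice recognition service. It converts voice data sent from a device connected to the Internet line 4 into text data and outputs the text data.
The language processing server 1, serving as the program name search support device, has a processor 11. The language processing server 1 analyzes text data input from the broadcast receiving device 2 or the voice recognition server 3, extracts the operation content, parameters, and the like for the broadcast receiving device 2, converts them into a format that the broadcast receiving device 2 can process, and outputs the result.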
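FIG. 2 is a block diagram showing the structure of the language processing server 1. The language processing server 1 includes the following circuits: a text data communication circuit 51, a language data analysis circuit 52, a command output circuit 53, a replacement dictionary registration circuit 54, a voice data transmission circuit 55, and a voice conversion circuit 56. The language processing server 1 also includes the following storage devices: a new program information database 61 and a replacement dictionary 62.
The functions of the text data communication circuit 51, the language data analysis circuit 52, the command output circuit 53, the replacement dictionary registration circuit 54, the voice data transmission circuit 55, and the voice conversion circuit 56 may be realized in software by a CPU (Central Processing Unit) serving as the processor 11, or in hardware using an FPGA or the like.
The text data communication circuit 51, serving as a text data acquisition circuit, controls the transmission and reception of text data with devices connected to the Internet line 4 (for example, the broadcast receiving device 2 and the voice recognition server 3). For example, it acquires the text data output from the voice recognition server 3, or acquires electronic program guide (EPG) information transmitted from a broadcasting station or the like. When the text data communication circuit 51 obtains new program information from another server (not shown), it registers the information in the new program information database 61. New program information is information on a program whose program name has newly appeared and has not previously been acquired in the EPG information.
FIG. 3 is a diagram illustrating an example of new program information data. The new program information data shown in FIG. 3 is registered in the new program information database 61. As shown in FIG. 3, one record is created for each program, and each record includes, for example, three items: "program name", "reading" (the reading of the program name written in hiragana), and "(information) acquisition date". "Program name" and "reading" are set to the data extracted from the new program information acquired by the text data communication circuit 51. "Acquisition date" is set to the date on which the new program information of the record was received from the other server.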
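As an illustration only, the three-item record of FIG. 3 could be modeled as follows. This is a minimal Python sketch: the class name `NewProgramRecord`, the field names, the helper `register`, its duplicate check, and the sample values (including the romanized reading and the dates) are assumptions made for this example and do not appear in the specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class NewProgramRecord:
    """One record of the new program information database 61 (hypothetical layout)."""
    program_name: str   # notation of the program name as it appears in the EPG
    reading: str        # reading (pronunciation) of the program name
    acquired_on: date   # date the information was received from the other server


def register(database: List[NewProgramRecord], record: NewProgramRecord) -> bool:
    """Append the record unless its program name is already registered (assumed policy)."""
    if any(r.program_name == record.program_name for r in database):
        return False
    database.append(record)
    return True


# Sample data; the reading and the date are placeholders chosen for the example.
database: List[NewProgramRecord] = []
register(database, NewProgramRecord("BACK STREET KIDS", "backstreetkids", date(2018, 11, 2)))
```

The duplicate check reflects the statement that new program information concerns program names not acquired so far; how the actual database enforces this is not described.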
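The language data analysis circuit 52 performs, as necessary, natural language analysis processing such as morpheme analysis and grammatical analysis on the text data (text constituted as a natural sentence) obtained by the text data communication circuit 51, and grasps the semantic content (operation content) of the text data. For example, when text data such as "I want to watch 〇〇 (program name)" is input, it is analyzed as the semantics "search" and "〇〇 (program name)". If a program name is included in the analysis result, the circuit checks whether that program name is registered in the replacement dictionary 62. If it is registered, the text data of the program name obtained as the analysis result is replaced with the other text data designated in the replacement dictionary 62 and output.
The command output circuit 53 converts the analysis result of the language data analysis circuit 52 into a format that the broadcast receiving device 2 can process and outputs it. In the example above, an operation instruction signal is output to the broadcast receiving device 2 so that a "search" operation with "〇〇 (program name)" as the keyword is performed.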
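The voice conversion circuit 56 converts input text data into voice data. The converted voice data is output to the voice data transmission circuit 55.
The voice data transmission circuit 55 transmits the voice data input from the voice conversion circuit 56 to a device connected to the Internet line 4 (for example, the voice recognition server 3).
The replacement dictionary registration circuit 54 creates program name replacement data and registers it in the replacement dictionary 62. FIG. 4 is a diagram illustrating an example of program name replacement data. As shown in FIG. 4, one record is created for each program, and each record includes, for example, three items: "input program name", "replacement program name", and "registration date". The "input program name" holds the program name (text data) obtained when the voice recognition server 3 converts the "reading" of the "program name" in the new program information into text data. The "replacement program name" holds the other text data to be used in place of the text data registered in the "input program name" (specifically, the "program name" of the new program information). The "registration date" holds the date on which the record was registered in the replacement dictionary 62. The language data analysis circuit 52 refers to the replacement dictionary 62.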
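The record layout of FIG. 4 and the lookup performed against it can be sketched in the same way. The following Python fragment is illustrative only: `ReplacementRecord`, `ReplacementDictionary`, their methods, and the registration date are hypothetical names and values, and keying the records by "input program name" is an assumption about how the lookup might be organized.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass(frozen=True)
class ReplacementRecord:
    """One record of the replacement dictionary 62 (hypothetical layout)."""
    input_program_name: str        # text returned by voice recognition
    replacement_program_name: str  # notation used in the EPG
    registered_on: date            # date the record was registered


class ReplacementDictionary:
    """Records keyed by input program name (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._records: Dict[str, ReplacementRecord] = {}

    def register(self, record: ReplacementRecord) -> None:
        self._records[record.input_program_name] = record

    def lookup(self, recognized_name: str) -> Optional[str]:
        """Return the EPG notation for a recognized name, or None if not registered."""
        record = self._records.get(recognized_name)
        return record.replacement_program_name if record else None


dictionary = ReplacementDictionary()
# Pair taken from the specification's example; the date is a placeholder.
dictionary.register(ReplacementRecord("三年K组", "3年K组", date(2018, 11, 2)))
```

Returning `None` for an unregistered name lets the caller keep the recognized text unchanged, which matches the described behavior of replacing only registered names.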
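Next, the method of creating the replacement dictionary in the language processing server 1 is described with reference to FIG. 5. FIG. 5 is a flowchart illustrating an example of the method of creating a replacement dictionary.
First, the text data communication circuit 51 acquires new program information from an external server or the like and registers it in the new program information database 61 (S1). Next, the replacement dictionary registration circuit 54 extracts, from the new program information database 61, the program names that are candidates for registration in the replacement dictionary (S2). For example, it extracts the programs registered in the new program information database 61 after the date on which the replacement dictionary was last created. If that date is, for example, November 1, 2018, the records whose "acquisition date" is November 2, 2018 or later are extracted, and the programs registered in the "program name" of the extracted records become replacement dictionary registration candidates.
The method of extracting replacement dictionary registration candidates is not limited to the one described above. For example, each record in the new program information database 61 may be given in advance a flag indicating whether it has already been extracted as a replacement dictionary registration candidate, and the programs registered in the "program name" of records whose flag indicates that they have not been extracted may be used as replacement dictionary registration candidates.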
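The date-based extraction of step S2 can be sketched as a simple filter. This is a minimal Python illustration, not the specification's implementation: the `NewProgramRecord` tuple, the function name `extract_candidates`, and the sample program names are assumptions. The strict comparison follows the example in the text, where a last creation date of November 1, 2018 selects records acquired on November 2, 2018 or later.

```python
from datetime import date
from typing import List, NamedTuple


class NewProgramRecord(NamedTuple):
    """Hypothetical stand-in for one record of the new program information database 61."""
    program_name: str
    reading: str
    acquired_on: date


def extract_candidates(records: List[NewProgramRecord], last_created_on: date) -> List[str]:
    """S2: keep program names acquired after the last replacement dictionary creation date."""
    return [r.program_name for r in records if r.acquired_on > last_created_on]


# Placeholder records; only the dates matter for this example.
records = [
    NewProgramRecord("Program A", "program a", date(2018, 11, 1)),
    NewProgramRecord("Program B", "program b", date(2018, 11, 2)),
]
candidates = extract_candidates(records, date(2018, 11, 1))
```

The flag-based alternative described above would replace the date comparison with a check on a per-record "already extracted" marker.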
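Next, the replacement dictionary registration circuit 54 converts the program name extracted in S2 into voice data and outputs it to the voice recognition server 3 (S3). Specifically, the replacement dictionary registration circuit 54 first outputs the extracted program name (text data) to the voice conversion circuit 56. The voice conversion circuit 56 converts the input text data into voice data and outputs it to the voice recognition server 3 via the voice data transmission circuit 55. When a plurality of program names are extracted in S2, one program name is selected from them and step S3 is executed.
The process of converting the extracted program name (text data) into voice data is not limited to the voice conversion circuit 56, and may be performed by another server or the like that can receive text data from the language processing server 1 and has a voice conversion function. For example, when the broadcast receiving device 2 has a voice conversion circuit, the program name may be sent from the replacement dictionary registration circuit 54 to the broadcast receiving device 2 via the text data communication circuit 51 and the Internet line 4, and the voice conversion circuit of the broadcast receiving device 2 may convert the text data into voice data. In this case, the converted voice data is output from the broadcast receiving device 2 to the voice recognition server 3 via the Internet line 4.
Next, the text data communication circuit 51 of the language processing server 1 obtains from the voice recognition server 3 the recognition result of the voice data output in S3, that is, the text data converted from the voice data (S4). The text data communication circuit 51 outputs the obtained text data to the replacement dictionary registration circuit 54. The replacement dictionary registration circuit 54 compares the input text data with the program name (text data) output in S3 (S5). When the two do not match (S5, No), the compared text data is registered as a new record in the replacement dictionary 62 (S6). That is, a new record is registered in which the program name (text data) output to the voice recognition server 3 in S3 is set as the "replacement program name", the text data acquired from the voice recognition server 3 in S4 is set as the "input program name", and the date on which the registration is performed is set as the "registration date".
For example, in S3, voice data obtained by voice conversion of the program name (text data) written as "3年K组", that is, voice data obtained by voice conversion of the reading "sanniankzu", is output to the voice recognition server 3. If the text data "三年K组" is input in S4, the two do not match. Therefore, as in the top record of the table shown in FIG. 4, the pair of text data "3年K组" and "三年K组" is registered in the replacement dictionary 62 in S6.
On the other hand, when the input text data matches the program name (text data) output in S3, nothing is registered in the replacement dictionary 62 and processing proceeds to S7.
When a plurality of candidate program names were extracted in S2 and the series of steps (S3 to S6) that determines whether to register in the replacement dictionary 62 has been executed for all of them (S7, Yes), the replacement dictionary creation procedure in the language processing server 1 shown in FIG. 5 ends. When a program name remains for which that series of steps has not been executed (S7, No), processing proceeds to S8, one of the unprocessed program names is extracted and set as the next program name to be judged, and steps S3 to S6 are executed for it.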
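The loop of FIG. 5 (S3 to S8) can be summarized in a short sketch. The Python code below is a hedged illustration under stated assumptions: `build_replacement_entries`, `fake_synthesize`, `fake_recognize`, and `FAKE_RESULTS` are hypothetical names, the two callables merely stand in for the voice conversion circuit 56 and the voice recognition server 3, and the sketch synthesizes voice from the reading, as in the "sanniankzu" example.

```python
from datetime import date
from typing import Callable, Dict, Iterable, Tuple


def build_replacement_entries(
    candidates: Iterable[Tuple[str, str]],   # (program name, reading) pairs from S2
    synthesize: Callable[[str], bytes],      # stand-in for the voice conversion circuit 56
    recognize: Callable[[bytes], str],       # stand-in for the voice recognition server 3
    today: date,
) -> Dict[str, Tuple[str, date]]:
    """Return {input program name: (replacement program name, registration date)}."""
    entries: Dict[str, Tuple[str, date]] = {}
    for program_name, reading in candidates:   # S7/S8: process candidates one at a time
        voice = synthesize(reading)             # S3: convert to voice data and send it
        recognized = recognize(voice)           # S4: obtain the recognition result
        if recognized != program_name:          # S5: compare with the EPG notation
            entries[recognized] = (program_name, today)  # S6: register the mismatch
    return entries


# Fake converters so the sketch runs without any speech service (assumptions).
def fake_synthesize(reading: str) -> bytes:
    return reading.encode("utf-8")


FAKE_RESULTS = {"sanniankzu": "三年K组", "news": "News"}


def fake_recognize(voice: bytes) -> str:
    return FAKE_RESULTS[voice.decode("utf-8")]


entries = build_replacement_entries(
    [("3年K组", "sanniankzu"), ("News", "news")],
    fake_synthesize,
    fake_recognize,
    date(2018, 11, 2),
)
```

Passing the converters in as callables mirrors the statement that voice conversion may be performed by the voice conversion circuit 56, another server, or the broadcast receiving device 2.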
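In this way, program names whose notation obtained by voice recognition differs from the notation in the electronic program guide (EPG) are registered in advance in the replacement dictionary, so that search accuracy can be improved when a voice search is performed using the broadcast receiving device 2.
For example, in the program name search system shown in FIG. 1, a voice search for a specific program name is performed as follows. First, the user uses the remote controller 21 of the broadcast receiving device 2 to state by voice the intention to search for a specific program name. For example, to search for the program "BACK STREET KIDS", the user says "xiangyaokanbackstreetkids" toward the microphone of the remote controller 21 ("xiangyaokanbackstreetkids" is the pronunciation of the instruction spoken by the user).
The broadcast receiving device 2 transmits the voice data input by the user (for example, the voice data "xiangyaokanbackstreetkids") to the voice recognition server 3 via the Internet line 4. The voice recognition server 3 converts the input voice data into text data and sends it to the language processing server 1 via the Internet line 4. For example, the voice data "xiangyaokanbackstreetkids" is converted into the text data "想要看BACKSTREETKIDS" ("I want to watch BACKSTREETKIDS") and sent (in the case of Japanese, it is converted into text data combining Katakana and Kanji).
The text data input from the voice recognition server 3 is output from the text data communication circuit 51 to the language data analysis circuit 52. The language data analysis circuit 52 performs natural language analysis processing on the input text data, grasps the semantic content of the text data, and generates an analysis result. For example, when the text data "想要看BACKSTREETKIDS" is input, the following analysis result is generated: perform the "search" operation with the program name "BACKSTREETKIDS" (in Katakana in the case of Japanese) as the keyword.
The language data analysis circuit 52 checks whether the program name included in the analysis result has been registered as an "input program name" in the replacement dictionary 62. If it has, the program name included in the analysis result is replaced with the text data registered in the "replacement program name" of that record. The analysis result with the replaced program name is then output to the command output circuit 53.
For example, when the analysis result contains the program name "BACKSTREETKIDS" (in Katakana in the case of Japanese), the language data analysis circuit 52 refers to the replacement dictionary 62 and checks whether that program name (text data) has been registered as an "input program name". If it has, the program name (text data) registered in the "replacement program name" of that record is extracted.
When the replacement dictionary registration based on the EPG notation has already been carried out as described above, a record whose "input program name" is "BACKSTREETKIDS" (in Katakana in the case of Japanese) and whose "replacement program name" is "BACK STREET KIDS" has been registered in the replacement dictionary 62. Therefore, the language data analysis circuit 52 refers to the replacement dictionary 62 and replaces the program name (character string) "BACKSTREETKIDS" in the analysis result with "BACK STREET KIDS". The following analysis result is then output to the command output circuit 53: perform the "search" operation with the program name "BACK STREET KIDS" as the keyword.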
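The replacement applied to the analysis result at search time can be sketched as a dictionary lookup with a fallback. The Python fragment below is illustrative only: representing the analysis result as a dict with the keys `operation` and `program_name`, and the function name `apply_replacement`, are assumptions of this sketch rather than formats given in the specification.

```python
from typing import Dict


def apply_replacement(analysis: Dict[str, str], replacements: Dict[str, str]) -> Dict[str, str]:
    """Swap the recognized program name for the EPG notation when it is registered.

    `replacements` maps "input program name" to "replacement program name".
    Names that are not registered are passed through unchanged.
    """
    name = analysis["program_name"]
    return {**analysis, "program_name": replacements.get(name, name)}


# Pair taken from the specification's example.
replacements = {"BACKSTREETKIDS": "BACK STREET KIDS"}
result = apply_replacement(
    {"operation": "search", "program_name": "BACKSTREETKIDS"},
    replacements,
)
```

Because the function returns a new dict, the original analysis result is left untouched, which keeps the recognized text available if it is needed later.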
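The command output circuit 53 converts the analysis result input from the language data analysis circuit 52 into a format that the broadcast receiving device 2 can process and outputs it. In the example above, an operation instruction signal is output to the broadcast receiving device 2 so that "BACK STREET KIDS" is "searched" for among program names in the program guide (EPG), recorded data, and the like.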
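The specification does not state the format of the operation instruction signal. Purely as an assumption for illustration, the sketch below serializes the analysis result as JSON; the function name `to_command` and the keys `command`, `keyword`, and `targets` are hypothetical and are not part of the described device.

```python
import json
from typing import Dict


def to_command(analysis: Dict[str, str]) -> str:
    """Serialize an analysis result into a hypothetical JSON operation instruction."""
    payload = {
        "command": analysis["operation"],      # e.g. "search"
        "keyword": analysis["program_name"],   # program name in EPG notation
        "targets": ["epg", "recordings"],      # assumed search scopes
    }
    # ensure_ascii=False keeps non-ASCII program names readable in the output
    return json.dumps(payload, ensure_ascii=False)


command = to_command({"operation": "search", "program_name": "BACK STREET KIDS"})
```

Any format that the broadcast receiving device 2 can process would serve the same purpose; JSON is used here only because it is easy to inspect.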
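The broadcast receiving device 2 extracts the programs whose names are written "BACK STREET KIDS" from the stored program guide (EPG) and the like, and displays them on a search result screen or the like. If the text data output from the voice recognition server 3, namely "BACKSTREETKIDS", were used as it is to give a search instruction to the broadcast receiving device 2, the desired program could not be found, because the text differs from the notation ("BACK STREET KIDS") in the program guide (EPG). In the present embodiment, by contrast, for a program name whose notation in the text data output from the voice recognition server 3 differs from the notation in the EPG, the language processing server 1 replaces it with the EPG notation before giving the operation instruction to the broadcast receiving device 2. This reduces search omissions caused by mismatched notation and improves search accuracy in voice search.
The case in which the broadcast receiving device 2 performs a voice search for a program name has been described above as an example, but the embodiment can also be applied to voice searches for programs supplied by distribution operators through networks such as Internet TV.
Several embodiments of the present invention have been described, but these embodiments are presented as examples and are not intended to limit the scope of the invention. These novel embodiments can be implemented in various other forms, and various omissions, replacements, and changes can be made without departing from the gist of the invention. These embodiments and their modifications are included in the scope and gist of the invention, and are included in the invention described in the claims and its equivalents.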
Claims (5)
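- A program name search support device, comprising: a program information storage unit that stores, as a pair, the notation of a program name stored in a program name search target device and the reading of the program name, the notation of the program name stored in the program name search target device being a first program name; a text data acquisition circuit that acquires a second program name, the second program name being text data obtained by performing voice recognition processing on voice data based on the reading of the program name stored in the program information storage unit; and a replacement dictionary that stores the first program name and the second program name as a pair when the first program name and the second program name are different.
- The program name search support device according to claim 1, wherein the program name stored in the program name search target device is a program name listed in an electronic program guide.
- The program name search support device according to claim 1 or 2, further comprising a language data analysis circuit that executes natural language analysis processing, the natural language analysis processing being analysis of text data obtained by performing voice recognition processing on operation voice data input in order to operate the program name search target device, wherein, when the second program name is present in the text data subject to the natural language analysis processing and the second program name is stored in the replacement dictionary, the language data analysis circuit replaces the second program name with the first program name stored as its pair in the replacement dictionary and executes the natural language analysis processing.
- A program name search support method, comprising: acquiring, as a pair, the notation of a program name stored in a program name search target device and the reading of the program name, the notation of the program name stored in the program name search target device being a first program name; acquiring a second program name, which is text data obtained by performing voice recognition processing on voice data based on the reading of the program name; and, when the first program name differs from the second program name, registering the first program name and the second program name as a pair in a replacement dictionary.
- The program name search support method according to claim 4, wherein the program name stored in the program name search target device is a program name listed in an electronic program guide.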
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202080002686.1A CN112243524B (zh) | 2019-03-20 | 2020-03-19 | Program name search support device and program name search support method |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2019053657A JP7202938B2 (ja) | 2019-03-20 | 2019-03-20 | 番組名検索支援装置、及び、番組名検索支援方法 |
JP2019-053657 | 2019-03-20 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020187301A1 (zh) | 2020-09-24 |
Family
ID=72519567
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/080259 WO2020187301A1 (zh) | Program name search support device and program name search support method | 2019-03-20 | 2020-03-19 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP7202938B2 (zh) |
CN (1) | CN112243524B (zh) |
WO (1) | WO2020187301A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP7241142B1 (ja) | 2021-09-27 | 2023-03-16 | Tvs Regza株式会社 | 受信装置および選局システム |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001022374A (ja) * | 1999-07-05 | 2001-01-26 | Victor Co Of Japan Ltd | 電子番組ガイドの操作装置および電子番組ガイドの送信装置 |
CN1530926A (zh) * | 2003-03-13 | 2004-09-22 | 松下电器产业株式会社 | 语音识别词典制作装置及信息检索装置 |
CN105225659A (zh) * | 2015-09-10 | 2016-01-06 | 中国航空无线电电子研究所 | 一种指令式语音控制发音词典辅助生成方法 |
CN108172223A (zh) * | 2017-12-14 | 2018-06-15 | 深圳市欧瑞博科技有限公司 | 语音指令识别方法、装置及服务器和计算机可读存储介质 |
Family Cites Families (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
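AT390685B (de) * | 1988-10-25 | 1990-06-11 | Philips Nv | Text processing system |
JPH05143303A (ja) * | 1991-11-22 | 1993-06-11 | Kobe Nippon Denki Software Kk | Program information capture method |
DE4142812A1 (de) * | 1991-12-23 | 1993-06-24 | Heyo Dr Ing Habil Mennenga | Solution for wide-area transmission of predominantly text messages |
JPH11161640A (ja) * | 1997-11-27 | 1999-06-18 | Toshiba Corp | Chinese input conversion processing device, Chinese input conversion processing method, and recording medium storing a Chinese input conversion processing program |
JP4550207B2 (ja) | 2000-02-29 | 2010-09-22 | Clarion Co., Ltd. | Speech recognition device and speech recognition navigation device |
JP3639776B2 (ja) * | 2000-07-28 | 2005-04-20 | Sharp Corporation | Speech recognition dictionary creation device, speech recognition dictionary creation method, speech recognition device, portable terminal, and program recording medium |
JP2005227545A (ja) * | 2004-02-13 | 2005-08-25 | Matsushita Electric Ind Co Ltd | Dictionary creation device, program guide device, and dictionary creation method |
CN1921606A (zh) * | 2005-08-22 | 2007-02-28 | Shanghai LG Electronics Co., Ltd. | Broadcast program retrieval device and method in an electronic program guide (EPG) |
JP2007140194A (ja) * | 2005-11-18 | 2007-06-07 | Mitsubishi Electric Corp | Program search device and morpheme dictionary management server |
JP2007178927A (ja) | 2005-12-28 | 2007-07-12 | Canon Inc | Information retrieval device and method |
JP4816409B2 (ja) | 2006-01-10 | 2011-11-16 | Nissan Motor Co., Ltd. | Recognition dictionary system and method for updating the same |
JP2007257134A (ja) * | 2006-03-22 | 2007-10-04 | Mitsubishi Electric Corp | Voice search device, voice search method, and voice search program |
JP5142769B2 (ja) * | 2008-03-11 | 2013-02-13 | Hitachi, Ltd. | Voice data retrieval system and voice data retrieval method |
JP2010072507A (ja) | 2008-09-22 | 2010-04-02 | Toshiba Corp | Speech recognition search device and speech recognition search method |
JP2010175708A (ja) * | 2009-01-28 | 2010-08-12 | Toshiba Corp | Speech recognition search system and speech recognition search method |
WO2015045039A1 (ja) | 2013-09-25 | 2015-04-02 | Toshiba Corporation | Method, electronic device, and program |
CN104519403A (zh) * | 2014-12-25 | 2015-04-15 | Xi'an NovaStar Electronic Technology Co., Ltd. | Audio control device and method |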
US10289677B2 (en) * | 2015-02-19 | 2019-05-14 | Tribune Broadcasting Company, Llc | Systems and methods for using a program schedule to facilitate modifying closed-captioning text |
US10418026B2 (en) * | 2016-07-15 | 2019-09-17 | Comcast Cable Communications, Llc | Dynamic language and command recognition |
- 2019-03-20: JP application JP2019053657A, patent JP7202938B2 (ja), status: Active
- 2020-03-19: WO application PCT/CN2020/080259, publication WO2020187301A1 (zh), status: Application Filing
- 2020-03-19: CN application CN202080002686.1A, patent CN112243524B (zh), status: Active
Also Published As
Publication number | Publication date |
---|---|
CN112243524B (zh) | 2023-08-04 |
JP7202938B2 (ja) | 2023-01-12 |
CN112243524A (zh) | 2021-01-19 |
JP2020155976A (ja) | 2020-09-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
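US11636146B2 (en) | | Content analysis to enhance voice search |
JP6603754B2 (ja) | | Information processing device |
US8620658B2 (en) | | Voice chat system, information processing apparatus, speech recognition method, keyword data electrode detection method, and program for speech recognition |
WO2022042512A1 (zh) | | Text processing method and apparatus, electronic device, and medium |
WO2015146017A1 (ja) | | Voice search device, voice search method, and display device |
US11494434B2 (en) | | Systems and methods for managing voice queries using pronunciation information |
CN103106287B (zh) | | Method and system for processing user search statements |
JP5296598B2 (ja) | | Voice information extraction device |
RU2009143360A (ru) | | Method, system, and user interface for automatically creating an atmosphere, in particular a lighting atmosphere, based on keyword input |
KR20140051767A (ko) | | Program recommendation device and program recommendation program |
KR20100067174A (ko) | | Metadata searcher using speech recognition, search method, and IPTV receiving device |
US20150052169A1 (en) | | Method, electronic device, and computer program product |
JP6622165B2 (ja) | | Dialogue log analysis device, dialogue log analysis method, and program |
CN201869323U (zh) | | Set-top box |
JPH11161661A (ja) | | Information retrieval device |
JP2018045001A (ja) | | Speech recognition system, information processing device, program, and speech recognition method |
WO2019123854A1 (ja) | | Translation device, translation method, and program |
WO2020187301A1 (zh) | | Program name retrieval assistance device and program name retrieval assistance method |
JP4848397B2 (ja) | | Related query derivation device, related query derivation method, and program |
WO2016129188A1 (ja) | | Speech recognition processing device, speech recognition processing method, and program |
JP2009163358A (ja) | | Information processing device, information processing method, program, and voice chat system |
JP6433045B2 (ja) | | Keyword extraction device and program |
JP2007199315A (ja) | | Content providing device |
CN117493606A (zh) | | Video retrieval method, apparatus, system, electronic device, and storage medium |
JP2008225676A (ja) | | Dictionary search device and control program therefor |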
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20774846; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 20774846; Country of ref document: EP; Kind code of ref document: A1 |