CN105306420B - Method, device and server for realizing cyclic playback of text-to-speech services - Google Patents
- Publication number
- CN105306420B (application CN201410305490.6A; also published as CN105306420A)
- Authority
- CN
- China
- Prior art keywords
- tts
- server
- text information
- media
- service
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/40—Network security protocols
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Telephonic Communication Services (AREA)
Abstract
Embodiments of the present invention provide a method, a device, and a server for cyclic playback of a text-to-speech (TTS) service. The method includes: when a TTS server completes one TTS service for a piece of text information using a media channel of the media server, determining whether the number of times the TTS server has completed the TTS service for that text information has reached the text information's loop playback count NUM, and obtaining a determination result; and, when the determination result is negative, interacting with the TTS server so that the TTS server can use the same media channel to complete another TTS service for the text information. The embodiments improve the performance of the media server when it handles TTS services.
Description
Technical Field
The present invention relates to the field of communications, and in particular to a method, a device, and a server for cyclic playback of a text-to-speech service.
Background
A media server (MS) is a standalone device in a softswitch architecture that provides dedicated media resource functions, and it is also an important device in packet networks. It provides the media processing functions for basic and enhanced services and handles all audio- and video-related media processing, including conversion between video/audio RTP data streams and video/audio files. It is also responsible for receiving the user's DTMF input from the terminal, playing a service's voice prompts, and displaying dynamic prompt screens. Its support for SIP and MSML/MOML allows it to handle interaction with the user throughout a session under the control of an application server (APP).
The Media Service Control Unit (MSCU) is an important unit in the media server. It mainly negotiates capabilities with other entities, manages and maintains the resources themselves, and controls the other service resource units to carry out complex services.
The Media Storage Transmit Unit-audio (hereinafter MSTU) is a service resource unit in the media server. It stores large volumes of audio data and implements audio file playback. The media storage unit has an external network port, and data can be sent and received directly through that port.
The media processing unit (Media Resource Unit, MRU) mainly performs media codec conversion, digit collection, and conference audio mixing.
Media servers are now widely used. Their main functions can be summarized as audio and video playback, digit collection, and conferencing.
Text-to-speech (TTS) takes input text, converts it into audio, and sends the audio to the user. In the telecommunications field today, TTS is typically deployed by configuring a dedicated TTS server and using signaling to direct that server to send audio to the user side, thereby completing one service.
Fig. 1 is a schematic structural diagram of a system that implements a TTS loop playback service according to the related art. As shown in Fig. 1, the system's workflow includes the following steps:
Step S101: the terminal initiates a call, which activates the APP's service. The APP initiates a service flow toward the media server;
Step S102: the APP sends SIP signaling N times to request that the media server complete N TTS services;
Step S103: the media server requests TTS resources from the TTS server through SIP signaling and controls the TTS server through the MRCP protocol to carry out the service;
Step S104: the TTS server sends media packets to the terminal through the media server, and the TTS server reports information such as the recognition and playback duration to the media server.
The above is the typical current networking and signaling control flow. The TTS server is used as a peripheral device of the media server. When the APP requests a service, it sends the request only to the media server, which determines the service type. When the service type is a TTS application, the media server in turn sends a request to the TTS server to apply for resources and controls the TTS server's behavior through the MRCP protocol, so that the text is automatically converted into audio and sent to the media server.
The above flow can complete multiple TTS services and thus achieve loop playback of the same text. However, each time the media server receives an INFO (TTS) service request, it applies for internal media resources (MSTU internal and external port resources, and MRU resources). To complete N rounds of recognition and playback of the same text, multiple resources are therefore requested and released many times, and the flow is complex. This greatly increases the failure rate and, especially under large call volumes, severely degrades media server performance.
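The cost described above can be made concrete with a small counting model. The sketch below is illustrative only: the resource names come from the text, but the assumption that each resource costs exactly one apply operation and one release operation per INFO (TTS) request is a simplification, and both function names are hypothetical. The second function models, for comparison, a flow that holds the resources across all N rounds.

```python
# Resource names taken from the text; the one-apply/one-release cost model
# is an assumption made purely for illustration.
RESOURCES = ("MSTU internal port", "MSTU external port", "MRU")


def baseline_resource_ops(n: int) -> int:
    """Count apply/release operations when every INFO (TTS) request
    applies for and then releases each internal media resource."""
    ops = 0
    for _ in range(n):
        ops += len(RESOURCES)  # apply for each resource
        ops += len(RESOURCES)  # release each resource
    return ops


def held_resource_ops(n: int) -> int:
    """Count apply/release operations when the resources are applied for
    once, held for all n rounds, and released once at the end."""
    if n <= 0:
        return 0
    return 2 * len(RESOURCES)


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(n, baseline_resource_ops(n), held_resource_ops(n))
```

Under this model the baseline grows linearly with N while the held-resource flow stays constant, which is the gap the rest of the document sets out to close.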
Summary of the Invention
In view of this, the embodiments of the present invention aim to provide a method, a device, and a server for cyclic playback of a text-to-speech service, so as to reduce the complexity of the media server's handling of internal media resources when supporting TTS loop playback.
To solve the above technical problem, the embodiments of the present invention provide the following solutions:
An embodiment of the present invention provides a method for cyclic playback of a text-to-speech (TTS) service, for use in a media server, including:
when a TTS server completes one TTS service for text information using a media channel of the media server, determining whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, and obtaining a determination result;
when the determination result is negative, interacting with the TTS server so that the TTS server can use the media channel to complete another TTS service for the text information.
Preferably, before determining whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, the method further includes:
receiving a TTS service request message for the text information sent by an application server, the TTS service request message carrying the loop playback count NUM;
parsing the loop playback count NUM from the TTS service request message.
Preferably, the method further includes:
opening the media channel upon receiving the TTS service request message for the text information sent by the application server.
Preferably, the method further includes:
when the determination result is affirmative, closing the media channel and notifying the APP server that the NUM loop playbacks of the text information are complete.
Preferably, the codec type corresponding to the media channel is determined by the media server through negotiation with the TTS server, based on the set of codec types that the media server supports.
An embodiment of the present invention also provides a device for cyclic playback of a text-to-speech (TTS) service, for use in a media server, including:
a determination module, configured to determine, when a TTS server completes one TTS service for text information using a media channel of the media server, whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, and to obtain a determination result;
an interaction module, configured to interact with the TTS server when the determination result is negative, so that the TTS server can use the media channel to complete another TTS service for the text information.
Preferably, the device further includes:
a receiving module, configured to receive, before the determination of whether the number of completed TTS services for the text information has reached the loop playback count NUM, a TTS service request message for the text information sent by an application server, the TTS service request message carrying the loop playback count NUM;
a parsing module, configured to parse, before the determination of whether the number of completed TTS services for the text information has reached the loop playback count NUM, the loop playback count NUM from the TTS service request message.
Preferably, the device further includes:
an opening module, configured to open the media channel upon receiving the TTS service request message for the text information sent by the application server.
Preferably, the device further includes:
a closing and notification module, configured to close the media channel when the determination result is affirmative and to notify the APP server that the NUM loop playbacks of the text information are complete.
Preferably, the codec type corresponding to the media channel is determined by the media server through negotiation with the TTS server, based on the set of codec types that the media server supports.
An embodiment of the present invention also provides a server that includes the above device for cyclic playback of a text-to-speech (TTS) service.
As can be seen from the above, the embodiments of the present invention have at least the following beneficial effects:
One TTS service is completed over the same media channel that was used to complete a previous TTS service. This avoids closing and reopening the media channel, which reduces the number of times internal media server resources are set up and released, along with the corresponding signaling exchanges. The load on the media server for handling resources and signaling is reduced accordingly, and its performance when handling TTS services improves.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the general flow of implementing a TTS loop playback service according to the related art;
Fig. 2 is a flowchart of the steps of a method for cyclic playback of a text-to-speech (TTS) service provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the interaction between the media server and each module in a preferred implementation of an embodiment of the present invention;
Fig. 4 is a timing diagram of the signaling exchanged between the media server and each module in a preferred implementation of an embodiment of the present invention;
Fig. 5 is a structural block diagram of a device for cyclic playback of a text-to-speech (TTS) service provided by an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the embodiments are described in detail below with reference to the accompanying drawings and specific embodiments.
Fig. 2 is a flowchart of the steps of a method for cyclic playback of a text-to-speech (TTS) service provided by an embodiment of the present invention. Referring to Fig. 2, the method includes the following steps:
Step 201: when a TTS server completes one TTS service for text information using a media channel of the media server, determine whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, and obtain a determination result;
Step 202: when the determination result is negative, interact with the TTS server so that the TTS server can use the media channel to complete another TTS service for the text information.
The method is used in a media server.
As can be seen, one TTS service is completed over the same media channel that was used to complete a previous TTS service. This avoids closing and reopening the media channel, which reduces the number of times internal media server resources are set up and released, along with the corresponding signaling exchanges. The load on the media server for handling resources and signaling is reduced accordingly, and its performance when handling TTS services improves.
In this embodiment, before determining whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, the method may further include:
receiving a TTS service request message for the text information sent by an application server, the TTS service request message carrying the loop playback count NUM;
parsing the loop playback count NUM from the TTS service request message.
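The text does not specify how NUM is encoded in the TTS service request message, so the following is only a sketch under a loudly stated assumption: the message body is treated as simple `key=value` lines, and the field name `loop` is hypothetical. A real deployment would parse whatever format the application server and media server have agreed on.

```python
def parse_loop_count(body: str, field: str = "loop") -> int:
    """Extract the loop playback count NUM from a request body.

    Assumption: the body consists of 'key=value' lines and NUM is carried
    in a field named 'loop'. Neither detail comes from the source text.
    """
    for line in body.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip().lower() == field:
            num = int(value.strip())
            if num < 1:
                raise ValueError("loop playback count must be at least 1")
            return num
    # Assumption: a request without the field means a single playback.
    return 1


if __name__ == "__main__":
    print(parse_loop_count("text=Welcome to the service\nloop=3"))
```

Defaulting to a single playback when the field is absent is a design choice of this sketch, not something the source states.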
Further, the method may also include:
opening the media channel upon receiving the TTS service request message for the text information sent by the application server.
In this embodiment, the method may further include:
when the determination result is affirmative, closing the media channel and notifying the APP server that the NUM loop playbacks of the text information are complete.
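The decision made each time a TTS service completes (continue on the same media channel, or close the channel and notify the APP server) can be sketched as a small state update. This is a minimal illustration, not the patented implementation: the class name, field names, and returned action strings are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoopState:
    num: int                   # loop playback count NUM from the service request
    completed: int = 0         # TTS services completed so far on this channel
    channel_open: bool = True  # channel is opened when the request arrives


def on_tts_service_completed(state: LoopState) -> str:
    """Handle one 'TTS service completed' event and return the next action."""
    state.completed += 1
    if state.completed >= state.num:
        # Determination result is affirmative: close the channel, notify the APP.
        state.channel_open = False
        return "notify_app_done"
    # Determination result is negative: reuse the open channel for another round.
    return "request_next_tts"


if __name__ == "__main__":
    state = LoopState(num=3)
    while state.channel_open:
        print(on_tts_service_completed(state))
```

The media channel is touched only in the affirmative branch, so it stays open across all intermediate rounds.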
In this embodiment, the codec type corresponding to the media channel may be determined by the media server through negotiation with the TTS server, based on the set of codec types that the media server supports.
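One simple way to picture this negotiation is as choosing the first codec in the media server's offer that the TTS server also supports. The sketch below assumes that selection rule; the source only says the codec type is negotiated from the media server's supported set. The function name and the codec names in the usage example are illustrative.

```python
from typing import Optional, Sequence, Set


def negotiate_codec(offered: Sequence[str], tts_supported: Set[str]) -> Optional[str]:
    """Return the first codec offered by the media server that the TTS
    server also supports, or None if the two sets do not overlap.

    Assumption: preference follows the order of the media server's offer.
    """
    for codec in offered:
        if codec in tts_supported:
            return codec
    return None


if __name__ == "__main__":
    media_server_offer = ["G729", "PCMA", "PCMU"]  # illustrative codec names
    print(negotiate_codec(media_server_offer, {"PCMU", "PCMA"}))
```

Because the result is fixed once per session, every later playback round on the same channel can reuse it without renegotiating.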
To describe the embodiments of the present invention more clearly, a preferred implementation is provided below.
To address the complex processing flow of TTS services that recognize the same text multiple times in the prior art, and its adverse effect on performance, this preferred implementation provides a method, device, and system for cyclic playback of a text-to-speech (TTS) service, to solve the prior-art problems of a high failure rate and low performance when the media server handles TTS loop playback services.
To achieve the above objective, a method for implementing a TTS loop playback service is provided, including:
the media server receives an access request from the application server (APP) and determines the set of codec types the media server supports;
the media server receives the TTS service request submitted by the APP and, according to the TTS service type, applies to the TTS server for service resources;
the media server parses the INFO (TTS) field to obtain the loop playback count N. After one TTS service completes, the media server does not release its local resources; it keeps the media link to the TTS server and then carries out the next MRCP negotiation for recognition of the text information, using N to determine the number of repeated requests. In the end, the TTS server sends the N audio playbacks produced from the text to the terminal through the media server, which needs to apply for resources only once.
To achieve the above objective, a system for implementing a text-to-speech (TTS) loop playback service is also provided, the implementing device including:
a first processing module, configured to receive an access request from the application server (APP) and determine the set of codec types the media server supports;
a second processing module, configured to receive the TTS service request submitted by the APP, apply for TTS service resources according to the TTS service type, and determine the loop playback count;
a third processing module, configured to negotiate with the TTS server according to the codec type set to obtain the negotiated audio codec type, and to send media service data packets through the media server to the terminal server in accordance with that audio codec type.
Fig. 3 is a schematic diagram of the interaction between the internal modules of the media server and the APP, the TTS server, and the terminal server according to this preferred implementation. As shown in Fig. 3, the media control unit (MSCU) sends Session Initiation Protocol (SIP) and MRCP signaling to the TTS server: the SIP negotiation settles and specifies the audio codec type shared by the media server and the TTS server, and the MRCP signaling exchange controls the TTS server's text recognition and playback content. The voice center switching unit (MRU) receives data packets from the TTS server and sends the media service data packets to the media storage transmit audio unit (MSTU). The MSCU controls the MSTU to send the media service data packets to the terminal.
Fig. 4 is a timing diagram of the signaling exchanged between the media server and each module. The detailed workflow is as follows:
Steps S410 and S420: the APP sends INVITE signaling to the media server for media negotiation; the media server selects a codec type from its own capability set and uses the MSTU external port address as the address for interacting with the terminal. The APP server then sends an INFO request to the media server whose content is a request for the TTS service; the media server parses the loop playback field in the INFO request, obtaining the value N, and saves all of the information;
Step S430: the media server negotiates with the TTS server and controls the TTS server to convert the text into speech. As shown in Fig. 4, step S430 may specifically include the following steps:
Step S4301: the media control unit (MSCU) initiates Session Initiation Protocol (SIP) signaling to the TTS server to negotiate the codec type. The audio codec capability set offered in the INVITE signaling at this point is the one the media server possesses, that is, all codec types supported by the MRU;
Step S4302: the TTS server replies to the INVITE message with 200 OK, notifying the media server of the negotiated audio codec type;
Step S4303: the MSCU applies for the media-server-side resources required by the TTS service: the MSTU external port resource and the MRU1 and MRU2 transcoding resources. The MSCU sends the MSTU a command to open the NAT channel and sends the MRU a command to start transcoding, indicating that the recognized audio packets sent by the TTS server will be received through the MRU internal port. The media channel on the media server side is now open;
Step S4304: the MSCU sends the TTS server a request to establish a TCP/IP connection;
Step S4305: the MSCU receives the TTS server's reply to the TCP/IP connection establishment request;
Step S4306: the MSCU sends an MRCP request message to the TTS server, indicating the text information that the TTS server is to recognize;
Step S4307: the TTS server replies to the MRCP request message, notifying the media server that text recognition is in progress, and sends audio packets to the MSTU external port; the MRU sends the data packets forwarded from the MSTU external port via NAT to the terminal;
Step S4308: the TTS server notifies the MSCU that this round of text recognition is complete, and the MSCU notifies the TTS server to close the TCP link for this round. The MSCU then consults the saved INFO (TTS) loop playback count N to determine whether it needs to send another MRCP request to the TTS server. If recognition and playback must continue, steps S4304 to S4308 are repeated;
Step S4309: once the N TTS services are complete, the MSCU sends a BYE request message to the TTS server, notifying it to release the SIP data area corresponding to this TTS service;
Step S4310: the media server receives the TTS server's BYE reply message and releases the SIP data area on the media server side; the service is complete.
Step S440: the media server sends an INFO message to the APP, reporting information such as the playback duration;
Step S450: the APP sends BYE signaling to the media server to release resources.
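The shape of step S430 (one SIP negotiation and one channel setup, N repetitions of steps S4304 to S4308, one teardown) can be sketched as a trace generator. This is a simulation only: it opens no sockets and sends no real SIP or MRCP messages, the message strings paraphrase the steps above, and the function name is hypothetical.

```python
from typing import List


def tts_loop_trace(n: int) -> List[str]:
    """Return the ordered list of S430 sub-steps for n playback rounds.

    Setup (S4301-S4303) and teardown (S4309-S4310) appear once;
    S4304-S4308 appear once per round.
    """
    trace = [
        "S4301 MSCU->TTS: SIP INVITE (offer MRU codec set)",
        "S4302 TTS->MSCU: 200 OK (negotiated codec)",
        "S4303 MSCU: apply for MSTU/MRU resources, open media channel",
    ]
    for round_no in range(1, n + 1):
        trace += [
            f"S4304 MSCU->TTS: establish TCP/IP connection (round {round_no})",
            f"S4305 TTS->MSCU: connection reply (round {round_no})",
            f"S4306 MSCU->TTS: MRCP request with text (round {round_no})",
            f"S4307 TTS->MSTU: audio packets, forwarded to terminal (round {round_no})",
            f"S4308 TTS->MSCU: recognition complete, close TCP link (round {round_no})",
        ]
    trace += [
        "S4309 MSCU->TTS: BYE",
        "S4310 TTS->MSCU: BYE reply, release SIP data area",
    ]
    return trace


if __name__ == "__main__":
    for line in tts_loop_trace(2):
        print(line)
```

In the trace, the SIP negotiation and resource application occur once regardless of N, while only the TCP and MRCP exchange repeats.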
In the above embodiment, the media server completes N MRCP requests and text recognition rounds on the basis of a single negotiation with the TTS server. This markedly reduces the load on the media server for handling resources and signaling and greatly improves the media server's performance when it handles TTS services.
Fig. 5 is a structural block diagram of a device for cyclic playback of a text-to-speech (TTS) service provided by an embodiment of the present invention. Referring to Fig. 5, the device includes:
a determination module 501, configured to determine, when a TTS server completes one TTS service for text information using a media channel of the media server, whether the number of times the TTS server has completed the TTS service for the text information has reached the loop playback count NUM of the text information, and to obtain a determination result;
an interaction module 502, configured to interact with the TTS server when the determination result is negative, so that the TTS server can use the media channel to complete another TTS service for the text information.
The device is used in a media server.
As can be seen, one TTS service is completed over the same media channel that was used to complete a previous TTS service. This avoids closing and reopening the media channel, which reduces the number of times internal media server resources are set up and released, along with the corresponding signaling exchanges. The load on the media server for handling resources and signaling is reduced accordingly, and its performance when handling TTS services improves.
本发明实施例中,还可以包括:In the embodiment of the present invention, may also include:
接收模块,用于所述判断所述TTS服务器完成对所述文本信息的TTS服务的次数是否达到所述文本信息的循环播放次数NUM之前,接收应用服务器发送的针对所述文本信息的TTS服务请求消息,所述TTS服务请求消息中携带有所述循环播放次数NUM;The receiving module is used to receive the TTS service request for the text information sent by the application server before the number of times that the TTS server completes the TTS service for the text information reaches the number NUM of loop play of the text information message, the TTS service request message carries the number of times NUM of loop play;
解析模块,用于所述判断所述TTS服务器完成对所述文本信息的TTS服务的次数是否达到所述文本信息的循环播放次数NUM之前,从所述TTS服务请求消息中解析出所述循环播放次数NUM。The parsing module is used to determine whether the TTS server completes the TTS service for the text information before reaching the number NUM of loop playback of the text information, and parse the loop playback from the TTS service request message Number of times NUM.
本发明实施例中,还可以包括:In the embodiment of the present invention, may also include:
打开模块,用于在接收到应用服务器发送的针对所述文本信息的TTS服务请求消息时,打开所述媒体通道。The opening module is configured to open the media channel when receiving the TTS service request message for the text information sent by the application server.
本发明实施例中,还可以包括:In the embodiment of the present invention, may also include:
关闭及通知模块,用于当所述判断结果为是时,关闭所述媒体通道,并向APP服务器通知针对所述文本信息的所述NUM次循环播放完成。The closing and notification module is used to close the media channel when the judgment result is yes, and notify the APP server that the NUM times of loop playing of the text information is completed.
本发明实施例中,所述媒体通道对应的编解码类型可以由所述媒体服务器根据所述媒体服务器支持的编解码类型集,与所述TTS服务器协商确定。In the embodiment of the present invention, the codec type corresponding to the media channel may be determined by the media server through negotiation with the TTS server according to the set of codec types supported by the media server.
本发明实施例还提供一种服务器,所述服务器包括以上所述的实现从文本到语音TTS业务循环播放的装置。所述服务器例如:媒体服务器。An embodiment of the present invention also provides a server, which includes the above-mentioned device for realizing the cyclic playback of the TTS service from text to voice. The server is for example: a media server.
The above are only implementations of the embodiments of the present invention. It should be noted that those of ordinary skill in the art may make further improvements and refinements without departing from the principles of the embodiments of the present invention, and such improvements and refinements shall also be regarded as falling within the scope of protection of the embodiments of the present invention.
Claims (11)
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410305490.6A CN105306420B (en) | 2014-06-27 | 2014-06-27 | Method, device and server for realizing cyclic playback of text-to-speech services |
PCT/CN2015/073051 WO2015196823A1 (en) | 2014-06-27 | 2015-02-13 | Method and device for achieving cyclic playing from text to voice service, and server |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410305490.6A CN105306420B (en) | 2014-06-27 | 2014-06-27 | Method, device and server for realizing cyclic playback of text-to-speech services |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105306420A (en) | 2016-02-03 |
CN105306420B (en) | 2019-08-30 |
Family
ID=54936711
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410305490.6A Active CN105306420B (en) | 2014-06-27 | 2014-06-27 | Method, device and server for realizing cyclic playback of text-to-speech services |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105306420B (en) |
WO (1) | WO2015196823A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107357727A (en) * | 2017-07-04 | 2017-11-17 | 广州君海网络科技有限公司 | APP testing results method, apparatus, readable storage medium storing program for executing and computer equipment |
CN111369970A (en) * | 2020-06-01 | 2020-07-03 | 浙江百应科技有限公司 | Method for intelligent routing of TTS (text to speech) channel with high availability |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI247219B (en) * | 2002-09-13 | 2006-01-11 | Ind Tech Res Inst | Efficient and scalable methods for text script generation in corpus-based tts desing |
US20060136212A1 (en) * | 2004-12-22 | 2006-06-22 | Motorola, Inc. | Method and apparatus for improving text-to-speech performance |
CN100486282C (en) * | 2006-03-27 | 2009-05-06 | 华为技术有限公司 | Method for realizing interactive voice |
CN101378391B (en) * | 2007-08-31 | 2011-12-21 | 华为技术有限公司 | Method and communication system for implementing medium service as well as relevant equipment |
CN201199724Y (en) * | 2008-02-05 | 2009-02-25 | 珠海市太川电子企业有限公司 | Indoor set for visual intercommunication system |
JP2009294640A (en) * | 2008-05-07 | 2009-12-17 | Seiko Epson Corp | Voice data creation system, program, semiconductor integrated circuit device, and method for producing semiconductor integrated circuit device |
CN102314874A (en) * | 2010-06-29 | 2012-01-11 | 鸿富锦精密工业(深圳)有限公司 | Text-to-voice conversion system and method |
CN102231734B (en) * | 2011-06-22 | 2017-10-03 | 南京中兴新软件有限责任公司 | Realize audio code-transferring method, the apparatus and system from Text To Speech TTS |
CN102394991B (en) * | 2011-09-28 | 2017-04-19 | 中兴通讯股份有限公司 | Method and system for realizing sound playing for assembly room in multimedia meeting business |
2014
- 2014-06-27: CN application CN201410305490.6A (CN105306420B), status: Active
2015
- 2015-02-13: WO application PCT/CN2015/073051 (WO2015196823A1), status: Application Filing
Also Published As
Publication number | Publication date |
---|---|
WO2015196823A1 (en) | 2015-12-30 |
CN105306420A (en) | 2016-02-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2005232263B2 (en) | Method and apparatus for signaling VoIP call based on class of service in VoIP service system | |
EP2779579B1 (en) | Method and apparatuses for realizing voip call in cloud computing environment | |
US20100082824A1 (en) | Program network recording method, media processing server and network recording system | |
US10469542B2 (en) | Communications methods, apparatus and systems to provide optimal media routing | |
EP2584760A1 (en) | Method for realizing video browsing, ip multimedia subsystem (ims) video monitoring system, and monitoring front end | |
WO2009082301A1 (en) | A method and an apparatus for handling multimedia calls | |
KR20110055378A (en) | Communication methods, computer readable media and communication devices | |
US9888083B2 (en) | Transcription of communication sessions | |
EP1883198B1 (en) | Method and system for interacting with media servers based on the sip protocol | |
CN103151041B (en) | A kind of implementation method of automatic speech recognition business, system and media server | |
CN101547266B (en) | Method and system for providing intelligent service and gateway | |
JP5235633B2 (en) | Mechanism for dynamically bypassing communication with characteristics not compatible with communication devices to another device | |
CN105306420B (en) | Method, device and server for realizing cyclic playback of text-to-speech services | |
CN113726968B (en) | Terminal communication method, device, server and storage medium | |
CN102111415A (en) | Interactive network voice response system with embedded VoIP and implementation method thereof | |
CN102045330B (en) | IMS soft terminal and communication method thereof | |
CN102231734A (en) | Method, device and system for realizing audio transcoding of TTS (Text To Speech) | |
CN101453446B (en) | A method, device and system for establishing MRCP control and bearer channels | |
CN101472019B (en) | Method, system and device for mutually communicating outband DTMF signaling | |
US8582559B2 (en) | System and method for handling media streams | |
KR20120058764A (en) | Method and apparatus for providing voice quality in voice over internet protocol | |
Karaagac et al. | Viot: voice over internet of things | |
WO2011120367A1 (en) | Method and device for analyzing voice quality | |
WO2009030171A1 (en) | Media service implementing method and communication system and associated devices | |
KR20090066062A (en) | SIP-based Internet telephone service system and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||