
CN113766165A - Interaction method, device, terminal and storage medium for realizing barrier-free video chat - Google Patents

Info

Publication number
CN113766165A
Authority
CN
China
Prior art keywords
video
chat
audio
client
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110917966.1A
Other languages
Chinese (zh)
Inventor
宋涛
柏超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Yiyu Intelligent Technology Co ltd
Original Assignee
Guangzhou Yiyu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Yiyu Intelligent Technology Co ltd
Priority to CN202110917966.1A
Publication of CN113766165A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1822 Conducting the conference, e.g. admission, detection, selection or grouping of participants, correlating users to one or more conference sessions, prioritising transmission

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses an interaction method, device, terminal and storage medium for realizing barrier-free video chat. Client A establishes a session: it first enters a chat room and connects to the chat signaling control service, which authenticates the client, assigns a client identifier, and allocates an audio/video stream playback address, an audio/video stream push address and a text chat session address. The client captures local audio and video through the audio/video collector and plays the local video in the local audio/video player; the video stream collector starts pushing the stream to the audio/video stream push address, and the text chat module starts establishing a chat at the text chat session address. By combining user interface design with artificial intelligence technology, the invention provides text communication during video chat and automatic conversion of speech into subtitles during chat, meeting the need for barrier-free communication, including barrier-free video chat between hearing-impaired and hearing people.

Figure 202110917966

Description

Interaction method, device, terminal and storage medium for realizing barrier-free video chat
Technical Field
The invention relates to the technical field of video chat, and in particular to an interaction method, a device, a terminal and a storage medium for realizing barrier-free video chat.
Background
Video chat is a real-time audio and video communication mode widely used in scenarios such as IM chat, video conferencing, and video authentication for banking, securities, fund, insurance and other financial services. Because it presents the images and voices of both parties at the same time, video chat makes up for the limited information carried by telephone, IM voice and text chat. With the development of smartphones, the Internet of Things and WiFi/4G/5G wireless communication technologies, the barrier to using video chat has dropped greatly and video chat has become increasingly common.
However, existing video chat products generally cannot meet the communication needs of hearing-impaired people, and there are two main problems. First, typical video chat products have no accompanying text communication function, so text cannot be used as a supplementary channel when a hearing-impaired person cannot hear or cannot speak during a chat. Second, typical video chat products have no ASR (automatic speech recognition) or TTS (text-to-speech) function: they cannot automatically convert speech into subtitles during a chat so that a hearing-impaired person can understand what a hearing person says by reading text, and they cannot convert text typed by a hearing-impaired person into speech so that a hearing person can understand it.
The main reasons for these problems are as follows. First, hearing-impaired people are a socially disadvantaged group, and software is rarely optimized specifically for their needs. Second, video chat only became widespread after 4G/5G networks were popularized and the mobile Internet matured, and related applications are still being refined. Third, adding text communication to video chat involves a relatively large number of scenarios, and existing video chat interface designs that add text chat find it difficult to balance the service experience. Fourth, ASR and TTS technologies have only come into wide use in recent years, following the rapid development of cloud computing, big data, machine learning and artificial intelligence. Adding ASR and TTS to video chat also has a technical threshold: for example, the audio stream must be separated and transcoded before ASR, and TTS must convert the text stream into an audio stream, which is then packaged with the video stream into a media stream. All of this adds audio/video and artificial intelligence computing cost.
The third and fourth points above are also the main technical challenges encountered in developing and implementing the present invention. An interaction method, a device, a terminal and a storage medium for realizing barrier-free video chat are therefore proposed to solve these problems.
Disclosure of Invention
The invention aims to provide an interaction method, a device, a terminal and a storage medium for realizing barrier-free video chat, which solve the problem that existing video chat products have no accompanying text communication function and cannot convert speech into subtitles during a chat.
To achieve the above object, a first aspect of the invention provides the following solution: an interaction method for realizing barrier-free video chat, comprising the following steps:
establishing a session for client A: client A first enters a chat room and connects to the chat signaling control service; the chat signaling control service authenticates the client, assigns a client identifier, and allocates an audio/video stream playback address, an audio/video stream push address and a text chat session address; the client captures local audio and video through the audio/video collector and plays the local video in the local audio/video player;
the video stream collector starts pushing the stream to the audio/video stream push address;
the text chat module starts establishing a chat at the text chat session address;
the audio/video stream player prepares to pull the stream from the audio/video stream playback address;
sharing the video chat and inviting client B to join;
establishing a session for client B: client B enters the chat room and connects to the chat signaling control service; the chat signaling control service authenticates the client, assigns a client identifier, and allocates an audio/video stream playback address, an audio/video stream push address and a text chat session address; the client captures local audio and video through the audio/video collector and plays the local video in the local audio/video player;
the video stream collector starts pushing the stream to the audio/video stream push address;
the text chat module starts establishing a chat at the text chat session address;
the audio/video stream player prepares to pull the stream from the audio/video stream playback address;
adding ASR subtitle processing to the hearing person's audio and video: the audio/video streaming service separates the audio of client B; the ASR service converts the audio to text, delivers the text result to the text chat room service to be mixed into the chat, and then delivers it to the audio/video streaming service for video subtitle mixing; the audio/video streaming service mixes the subtitles into client B's audio and video and then provides client A with a playback service for client B's audio and video plus subtitles; the text chat room service provides client B's text conversion result to client A to read, and also provides client B with its own text conversion result to read; the audio/video stream player plays the server-side video;
adding TTS audio processing to the hearing-impaired person's audio and video: the hearing-impaired user types text in the text chat module to communicate; the module connects to the text chat session address of the text chat room service and pushes client A's text chat content; the text chat room service presents client A's text chat content in the chat window of the text chat module, showing the text conversation of both parties; the text chat room service passes client A's text chat content to the TTS service; the TTS service converts the text to speech and passes the speech result to the audio/video streaming service for audio mixing; the audio/video streaming service mixes the TTS speech into client A's audio and video and provides client B with a playback service for client A's audio and video, which the audio/video stream player plays as server-side video;
keeping the chat session, taking client A dropping offline and re-entering as an example: client A unexpectedly drops out of the chat room; client A re-enters the chat room and connects to the chat signaling control service; the chat signaling control service gives the session ID to client A so that client A re-enters the chat session;
ending the chat session, taking client A actively ending the session as an example: client A notifies the chat signaling control service to end the chat session; the chat signaling control service notifies client B that the chat session has ended, and notifies the audio/video streaming service and the text chat room service to end their services for the chat session.
To achieve the above object, a second aspect of the invention provides the following solution: a device for realizing barrier-free video chat, comprising a client system and a server system.
Preferably, in the client system, one chat session has two or more clients.
Preferably, the client is implemented by two or more of software, an APP, an applet, a web page and H5.
Preferably, the basic modules of the client system include an audio/video stream player, an audio/video stream collector and a text chat module.
Preferably, the server system is deployed on a single machine or in a distributed manner, and its hardware is a local hardware server or a cloud server that has a public network IP or a domain name and is equipped with a CPU operation unit, a memory processing unit and a hard disk storage unit.
Preferably, the basic modules of the server system include a chat signaling control service, an audio/video streaming service, a text chat room service, an ASR service and a TTS service.
Preferably, the operating system of the server system is Windows, Linux or Unix.
To achieve the above object, a third aspect of the invention provides the following solution: a terminal for realizing barrier-free video chat, wherein:
the system terminal comprises a system memory and at least one processor, instructions are stored in the memory, and the memory and the at least one processor are interconnected through a line;
the at least one processor invokes the instructions in the memory to cause the server to perform the steps of the interaction method for barrier-free video chat of claim 1.
To achieve the above object, a fourth aspect of the invention provides the following solution: a readable storage medium for realizing barrier-free video chat, wherein:
the computer-readable storage medium has instructions stored thereon which, when executed by a processor, implement the steps of the interaction method for barrier-free video chat for hearing-impaired people of claim 1.
Compared with the prior art, the invention has the following beneficial effects:
by combining user interface design with artificial intelligence technology, the invention provides text communication during video chat and automatic conversion of speech into subtitles during chat, meeting the need for barrier-free communication, including barrier-free video chat between hearing-impaired and hearing people.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention;
FIG. 2 is a first schematic flowchart of the interaction method of the present invention;
FIG. 3 is a second schematic flowchart of the interaction method of the present invention;
FIG. 4 is a third schematic flowchart of the interaction method of the present invention;
FIG. 5 is a fourth schematic flowchart of the interaction method of the present invention.
Detailed Description
The present invention will now be described in more detail by way of examples, which are given by way of illustration only and are not intended to limit the scope of the present invention in any way.
In a first aspect, the invention provides the following technical solution: an interaction method for realizing barrier-free video chat, comprising the following steps:
First, a session is established for client A: client A enters a chat room and connects to the chat signaling control service; the chat signaling control service authenticates the client, assigns a client identifier, and allocates an audio/video stream playback address, an audio/video stream push address and a text chat session address; the client captures local audio and video through the audio/video collector and plays the local video in the local audio/video player;
the video stream collector starts pushing the stream to the audio/video stream push address;
the text chat module starts establishing a chat at the text chat session address;
the audio/video stream player prepares to pull the stream from the audio/video stream playback address.
Second, the video chat is shared and client B is invited to join.
Third, a session is established for client B: client B enters the chat room and connects to the chat signaling control service; the chat signaling control service authenticates the client, assigns a client identifier, and allocates an audio/video stream playback address, an audio/video stream push address and a text chat session address; the client captures local audio and video through the audio/video collector and plays the local video in the local audio/video player;
the video stream collector starts pushing the stream to the audio/video stream push address;
the text chat module starts establishing a chat at the text chat session address;
the audio/video stream player prepares to pull the stream from the audio/video stream playback address.
Fourth, ASR subtitle processing is added to the hearing person's audio and video: the audio/video streaming service separates the audio of client B; the ASR service converts the audio to text, delivers the text result to the text chat room service to be mixed into the chat, and then delivers it to the audio/video streaming service for video subtitle mixing; the audio/video streaming service mixes the subtitles into client B's audio and video and then provides client A with a playback service for client B's audio and video plus subtitles; the text chat room service provides client B's text conversion result to client A to read, and also provides client B with its own text conversion result to read; the audio/video stream player plays the server-side video.
Fifth, TTS audio processing is added to the hearing-impaired person's audio and video: the hearing-impaired user types text in the text chat module to communicate; the module connects to the text chat session address of the text chat room service and pushes client A's text chat content; the text chat room service presents client A's text chat content in the chat window of the text chat module, showing the text conversation of both parties; the text chat room service passes client A's text chat content to the TTS service; the TTS service converts the text to speech and passes the speech result to the audio/video streaming service for audio mixing; the audio/video streaming service mixes the TTS speech into client A's audio and video and provides client B with a playback service for client A's audio and video, which the audio/video stream player plays as server-side video.
Sixth, the chat session is kept. Taking client A dropping offline and re-entering as an example: client A unexpectedly drops out of the chat room; client A re-enters the chat room and connects to the chat signaling control service; the chat signaling control service gives the session ID to client A so that client A re-enters the chat session.
Seventh, the chat session is ended. Taking client A actively ending the session as an example: client A notifies the chat signaling control service to end the chat session; the chat signaling control service notifies client B that the chat session has ended, and notifies the audio/video streaming service and the text chat room service to end their services for the chat session.
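The session-keeping and session-ending steps can be sketched as a small state record held by the chat signaling control service. This is only an illustration under stated assumptions: the class `ChatSession`, its method names, the event name `session_ended` and the service labels are hypothetical and do not come from the patent, which does not specify data structures or message formats.

```python
class ChatSession:
    """Hypothetical session record kept by the chat signaling control service."""

    def __init__(self, session_id, clients):
        self.session_id = session_id
        self.online = {client: True for client in clients}
        self.ended = False
        self.notifications = []  # (recipient, event) pairs, recorded for illustration

    def drop(self, client):
        # Unexpected disconnect: only the client goes offline, the session stays alive.
        self.online[client] = False

    def rejoin(self, client):
        # Hand the existing session ID back so the client re-enters the same chat.
        if self.ended or client not in self.online:
            raise LookupError("no session to rejoin")
        self.online[client] = True
        return self.session_id

    def end(self, initiator):
        # Notify the other clients first, then the media and text chat services.
        if self.ended:
            return
        self.ended = True
        for client in self.online:
            if client != initiator:
                self.notifications.append((client, "session_ended"))
        for service in ("av_stream_service", "text_chat_room_service"):
            self.notifications.append((service, "session_ended"))
```

In this sketch a dropped client gets the same session ID back on `rejoin`, while `end` makes any later `rejoin` fail, which mirrors the difference between the sixth and seventh steps.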
In a second aspect, the invention provides the following technical solution: a device for realizing barrier-free video chat, comprising a client system and a server system.
In the client system, one chat session has two or more clients.
The client is implemented by two or more of software, an APP, an applet, a web page and H5.
The basic modules of the client system include an audio/video stream player, an audio/video stream collector and a text chat module.
The audio/video stream player plays the server-side video and the local video, plays the server-side audio, and overlays subtitles when playing server-side audio and video.
The audio/video stream collector captures local audio and video data through the camera and microphone modules of the operating system on which the client runs, and transmits the captured data to the server for processing; the data is finally played by one or more peer clients.
The text chat module handles the text communication content of the chat session; it supports entering text content in the session and displays the chat conversation between the local end and the peer end.
The server system is deployed on a single machine or in a distributed manner, and its hardware is a local hardware server or a cloud server that has a public network IP or a domain name and is equipped with a CPU (central processing unit), a memory processing unit and a hard disk storage unit.
The basic modules of the server system include a chat signaling control service, an audio/video streaming service, a text chat room service, an ASR service and a TTS service.
The chat signaling control service controls the chat process. It provides the client with a method for establishing a chat session, an audio/video stream playback address, an audio/video stream push address and a text chat session address; notifies each client when the chat session is closed; provides a keep-and-recover service for chat sessions; provides the ASR service with the chat session address to which text recognition results are sent; and provides the TTS service with the audio/video streaming service push address for speech results.
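A minimal sketch of the join handshake described above, assuming an in-memory service. The names `ChatSignalingService` and `SessionGrant`, the token check, the host names and the URL schemes are all hypothetical placeholders; the patent does not specify protocols or address formats.

```python
import itertools
from dataclasses import dataclass


@dataclass
class SessionGrant:
    """What the signaling service hands back to a client that joins a room."""
    client_id: str
    play_url: str  # address the client pulls the processed peer stream from
    push_url: str  # address the client pushes its own audio/video stream to
    chat_url: str  # text chat session address


class ChatSignalingService:
    """Hypothetical in-memory stand-in for the chat signaling control service."""

    def __init__(self, valid_tokens, media_host="media.example.com",
                 chat_host="chat.example.com"):
        self._valid_tokens = set(valid_tokens)
        self._media_host = media_host
        self._chat_host = chat_host
        self._ids = itertools.count(1)
        self.rooms = {}  # room_id -> {client_id: SessionGrant}

    def join(self, room_id, token):
        # Authenticate first; nothing is allocated for an unknown token.
        if token not in self._valid_tokens:
            raise PermissionError("authentication failed")
        client_id = f"c{next(self._ids)}"
        grant = SessionGrant(
            client_id=client_id,
            play_url=f"https://{self._media_host}/play/{room_id}/{client_id}",
            push_url=f"rtmp://{self._media_host}/push/{room_id}/{client_id}",
            chat_url=f"wss://{self._chat_host}/rooms/{room_id}",
        )
        self.rooms.setdefault(room_id, {})[client_id] = grant
        return grant
```

Two clients joining the same room receive distinct client identifiers and push addresses but the same text chat session address, which is one plausible reading of how both parties end up in a shared text conversation.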
The audio/video streaming service handles the streaming-media part of the chat. It provides the client with audio/video stream playback and push services; mixes the audio stream supplied by the TTS service into the audio/video stream the client is to play; separates the client's audio stream and passes it to the ASR service for recognition and conversion; and generates subtitles from the text produced by the ASR service and mixes them into the audio/video stream the client is to play.
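The patent does not say how subtitles are represented before they are mixed into the stream. As one possible illustration, the sketch below turns timed ASR segments into a WebVTT document, a standard text subtitle format; the choice of WebVTT, the `(start, end, text)` segment shape and the function names are assumptions made for this example only.

```python
def format_timestamp(seconds):
    """Render a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def to_webvtt(segments):
    """Turn (start_s, end_s, text) ASR segments into a WebVTT document."""
    lines = ["WEBVTT", ""]
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text)
        lines.append("")  # a blank line terminates each cue
    return "\n".join(lines)
```

A sidecar track like this lets the player overlay the text; burning subtitles into the video frames would be a different design with higher server-side cost.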
The text chat room service handles the text chat part of the chat. It receives and distributes text chat content from clients; provides the ASR service with a service for receiving text recognition results and mixes those results into the corresponding chat room; and provides text chat content from clients to the TTS service for TTS processing.
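The routing rules of the text chat room service can be sketched as follows. `TextChatRoom`, its callbacks and the `source` field are hypothetical; the sketch assumes that typed messages go to both the room and TTS, while ASR results are shown in the room only, so recognized speech is not spoken back.

```python
class TextChatRoom:
    """Hypothetical in-memory text chat room."""

    def __init__(self, tts_sink=None):
        self._subscribers = {}  # client_id -> callback taking a message dict
        self._tts_sink = tts_sink
        self.history = []

    def subscribe(self, client_id, callback):
        self._subscribers[client_id] = callback

    def _broadcast(self, message):
        self.history.append(message)
        for callback in self._subscribers.values():
            callback(message)

    def post_text(self, sender, text):
        # Typed message from a client: show it to everyone, then hand it to TTS.
        self._broadcast({"sender": sender, "text": text, "source": "typed"})
        if self._tts_sink is not None:
            self._tts_sink(sender, text)

    def post_asr_result(self, speaker, text):
        # Recognized speech is mixed into the same room but is not sent to TTS.
        self._broadcast({"sender": speaker, "text": text, "source": "asr"})
```

Because both kinds of message land in one history, each client's chat window shows a single merged conversation of typed text and recognized speech.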
The ASR service converts the speech part of the chat into text. It recognizes the audio stream and converts it to text, sends the recognition result to the text chat room address provided by the chat signaling control service, and sends the recognition result to the audio/video streaming service to be mixed in as subtitles.
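The two destinations of each recognition result can be illustrated with a small fan-out loop. The recognizer is passed in as a plain function because the patent names no ASR engine; `run_asr`, `chat_sink` and `subtitle_sink` are hypothetical names, and skipping empty results is an assumption of this sketch.

```python
from typing import Callable, Iterable


def run_asr(audio_chunks: Iterable[bytes],
            recognize: Callable[[bytes], str],
            chat_sink: Callable[[str], None],
            subtitle_sink: Callable[[str], None]) -> int:
    """Recognize each chunk and deliver non-empty text to both sinks.

    Returns the number of results delivered.
    """
    delivered = 0
    for chunk in audio_chunks:
        text = recognize(chunk).strip()
        if not text:
            continue  # nothing recognized (for example silence): send nothing
        chat_sink(text)      # text chat room address from the signaling service
        subtitle_sink(text)  # audio/video streaming service, for subtitle mixing
        delivered += 1
    return delivered
```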
The TTS service converts the text part of the chat into speech and sends the speech result to the audio/video streaming service address provided by the chat signaling control service.
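Once the TTS speech reaches the audio/video streaming service it has to be mixed into the client's own audio. A minimal sketch of that mixing step, assuming both inputs are already decoded to mono signed 16-bit PCM at the same sample rate and aligned in time; the function name `mix_pcm16` and the `tts_gain` parameter are hypothetical.

```python
def mix_pcm16(voice, tts, tts_gain=1.0):
    """Mix two sequences of signed 16-bit PCM samples, clipping to the valid range."""
    length = max(len(voice), len(tts))
    mixed = []
    for i in range(length):
        a = voice[i] if i < len(voice) else 0  # pad the shorter input with silence
        b = tts[i] if i < len(tts) else 0
        sample = int(a + b * tts_gain)
        mixed.append(max(-32768, min(32767, sample)))
    return mixed
```

Hard clipping keeps the example short; a production mixer would more likely attenuate or limit the signal to avoid audible distortion.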
The operating system of the server system is Windows, Linux or Unix.
In a third aspect, the invention provides the following technical solution: a terminal for realizing barrier-free video chat, wherein:
the system terminal comprises a system memory and at least one processor, instructions are stored in the memory, and the memory and the at least one processor are interconnected through a line;
the at least one processor invokes the instructions in the memory to cause the server to perform the steps of the interaction method for barrier-free video chat of claim 1.
In a fourth aspect, the invention provides the following technical solution: a readable storage medium for realizing barrier-free video chat, wherein:
the computer-readable storage medium has instructions stored thereon which, when executed by a processor, implement the steps of the interaction method for barrier-free video chat for hearing-impaired people of claim 1.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1.实现无障碍视频聊天的交互方式,其特征在于:所述无障碍视频聊天的交互方式,包括:1. realize the interactive mode of barrier-free video chat, it is characterized in that: the interactive mode of described barrier-free video chat, comprises: 客户端A会话建立,首先进入聊天室,接入聊天信令控制服务,聊天信令控制服务进行,鉴权,分配客户端标识,并分配音视频流播放地址,分配音视频流推送地址,分配文字聊天会话地址,通过音视频采集器采集本地音视频,在本地音视频播放器播放本地视频;Client A session is established, first enters the chat room, accesses the chat signaling control service, the chat signaling control service is performed, authenticates, assigns the client identifier, assigns the audio and video stream playback address, assigns the audio and video stream push address, assigns The text chat session address, collect local audio and video through the audio and video collector, and play the local video on the local audio and video player; 视频流采集器开始向音视频流推送地址推流;The video stream collector starts to push the stream to the address of the audio and video stream; 文字聊天模块开始向文字聊天会话地址建立聊天;The text chat module starts to establish a chat to the text chat session address; 音视频流播放器开始准备从音视频流播放地址拉流;The audio and video stream player starts to prepare to pull the stream from the audio and video stream playback address; 分享视频聊天、邀请客户端B加入;Share video chat and invite client B to join; 客户端B会话建立,客户端B进入聊天室,接入聊天信令控制服务,聊天信令控制服务进行,鉴权,分配客户端标识,并分配音视频流播放地址,分配音视频流推送地址,分配文字聊天会话地址,通过音视频采集器采集本地音视频,在本地音视频播放器播放本地视频;Client B session is established, client B enters the chat room, accesses the chat signaling control service, the chat signaling control service is performed, authenticates, assigns the client ID, assigns the audio and video stream playback address, and assigns the audio and video stream push address , assign a text chat session address, collect local audio and video through the audio and video collector, and play the local video on the local audio and video player; 视频流采集器开始向音视频流推送地址推流;The video stream collector starts to push the stream to the address of the audio and video stream; 文字聊天模块开始向文字聊天会话地址建立聊天;The text chat module starts to establish a chat to 
the text chat session address; the audio and video stream player starts preparing to pull the stream from the audio and video stream playback address;

ASR subtitle processing is added to the hearing person's audio and video: the audio and video streaming service separates out client B's audio; the ASR service converts the audio to text, hands the text conversion result to the text chat service for mixing, and then hands the text conversion result to the audio and video streaming service for video subtitle mixing; the audio and video streaming service mixes the video subtitles into client B's audio and video; the audio and video streaming service then provides client A with playback of client B's audio and video plus subtitles; the text chat room service provides client A with client B's text conversion result for client A to read, and at the same time provides client B with its own text conversion result for client B to read; the server-side video is played in the server-side audio and video stream player;

TTS audio processing is added to the hearing-impaired person's audio and video: the hearing-impaired user enters text in the text chat module to communicate; the client connects to the text chat session address of the text chat room service and pushes client A's text chat content; the text chat room service shows client A's text chat content in the chat window of the text chat module, which displays the text conversation of both parties; the text chat room service hands client A's text chat content to the TTS service for speech conversion; the TTS service converts the text to speech and hands the speech conversion result to the audio and video service for sound mixing; the audio and video service mixes the TTS sound into client A's audio and video; the audio and video streaming service provides client B with playback of client A's audio and video; the server-side video is played in the server-side audio and video stream player;

the chat session is maintained, taking the case where client A drops offline and re-enters as an example: client A unexpectedly drops out of the chat room; client A re-enters the chat room and connects to the chat signaling control service; the chat signaling control service returns the session ID to client A so that client A re-enters the chat session;

the chat session is ended, taking the case where client A actively ends the session as an example: client A notifies the chat signaling control service to end the chat session; the chat signaling control service notifies client B that the chat session has ended; the chat signaling control service notifies the audio and video streaming service and the text chat room service to end the chat session service.

2. A device for realizing barrier-free video chat, characterized in that the device comprises a client system and a server system.

3. The device for realizing barrier-free video chat according to claim 2, characterized in that, in the client system, there are two or more clients in one chat session.

4. The device for realizing barrier-free video chat according to claim 3, characterized in that the client is implemented as two or more of software, an APP, a mini program, a web page and H5.

5. The device for realizing barrier-free video chat according to claim 3, characterized in that the basic modules of the client system include an audio and video stream player, an audio and video stream collector and a text chat module.

6. The device for realizing barrier-free video chat according to claim 2, characterized in that the server system uses stand-alone deployment or distributed deployment, and its hardware is a local hardware server or cloud server that has a public network IP or domain name, a CPU computing unit, a memory processing unit and a hard disk storage unit.

7. The device for realizing barrier-free video chat according to claim 6, characterized in that the basic modules of the server system include a chat signaling control service, an audio and video streaming service, a text chat room service, an ASR service and a TTS service.

8. The device for realizing barrier-free video chat according to claim 7, characterized in that the operating system of the server system is Windows, Linux or Unix.

9. A terminal for realizing barrier-free video chat, characterized in that the system terminal comprises a system memory and at least one processor, instructions are stored in the memory, and the memory and the at least one processor are interconnected through a line; the at least one processor invokes the instructions in the memory so that the server performs the steps of the barrier-free video chat interaction method of claim 1.

10. A readable storage medium for realizing barrier-free video chat, characterized in that instructions are stored on the computer-readable storage medium, and the instructions, when executed by a processor, implement the steps of the interaction method for barrier-free video chat for hearing-impaired persons of claim 1.
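The session steps recited in claim 1 (ASR subtitles for the hearing user's speech, TTS audio for the hearing-impaired user's text, session keep after a disconnect, and session end) can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the class `SignalingControl`, its method names, the `asr` and `tts` callables, and the service labels `av_stream_service` and `text_chat_room_service` are all hypothetical names introduced for illustration, and the actual mixing of subtitles or TTS sound into a video stream is not modeled.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Session:
    session_id: str
    clients: Tuple[str, str]  # (hearing-impaired client A, hearing client B)
    chat_log: List[Tuple[str, str]] = field(default_factory=list)  # (sender, text)
    ended: bool = False


class SignalingControl:
    """Hypothetical stand-in for the chat signaling control service."""

    def __init__(self, asr: Callable[[bytes], str], tts: Callable[[str], bytes]):
        self.asr = asr  # assumed ASR backend: audio bytes -> text
        self.tts = tts  # assumed TTS backend: text -> audio bytes
        self.sessions: Dict[str, Session] = {}
        self.session_of: Dict[str, str] = {}      # client -> session ID
        self.notices: List[Tuple[str, str]] = []  # (recipient, event)

    def start(self, client_a: str, client_b: str) -> str:
        """Create a session and remember which session each client belongs to."""
        sid = uuid.uuid4().hex
        self.sessions[sid] = Session(sid, (client_a, client_b))
        self.session_of[client_a] = sid
        self.session_of[client_b] = sid
        return sid

    def on_hearing_audio(self, sid: str, audio: bytes) -> Tuple[bytes, str]:
        """ASR path: transcribe B's audio, log it in the text chat, return audio plus subtitle."""
        session = self.sessions[sid]
        text = self.asr(audio)
        session.chat_log.append((session.clients[1], text))
        return audio, text

    def on_text(self, sid: str, text: str) -> bytes:
        """TTS path: log A's typed text, then synthesize speech for B."""
        session = self.sessions[sid]
        session.chat_log.append((session.clients[0], text))
        return self.tts(text)

    def rejoin(self, client: str) -> str:
        """Session keep: a client that dropped out gets its existing session ID back."""
        sid = self.session_of[client]
        if self.sessions[sid].ended:
            raise KeyError(f"session {sid} has already ended")
        return sid

    def end(self, sid: str, by_client: str) -> None:
        """Session end: notify the other client, then the media and text services."""
        session = self.sessions[sid]
        session.ended = True
        for recipient in session.clients:
            if recipient != by_client:
                self.notices.append((recipient, "session_ended"))
        for service in ("av_stream_service", "text_chat_room_service"):
            self.notices.append((service, "session_ended"))
```

In this sketch the ASR and TTS backends are injected as plain callables so that any engine could be substituted; keeping the client-to-session mapping inside the signaling service is what lets a reconnecting client recover the same session ID rather than start a new chat.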
CN202110917966.1A 2021-08-11 2021-08-11 Interaction method, device, terminal and storage medium for realizing barrier-free video chat Pending CN113766165A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110917966.1A CN113766165A (en) 2021-08-11 2021-08-11 Interaction method, device, terminal and storage medium for realizing barrier-free video chat

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110917966.1A CN113766165A (en) 2021-08-11 2021-08-11 Interaction method, device, terminal and storage medium for realizing barrier-free video chat

Publications (1)

Publication Number Publication Date
CN113766165A true CN113766165A (en) 2021-12-07

Family

ID=78788943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110917966.1A Pending CN113766165A (en) 2021-08-11 2021-08-11 Interaction method, device, terminal and storage medium for realizing barrier-free video chat

Country Status (1)

Country Link
CN (1) CN113766165A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115767484A (en) * 2022-11-07 2023-03-07 中国联合网络通信集团有限公司 Call processing method, device, server, system and medium in customer service scene

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1972428A (en) * 2005-11-24 2007-05-30 中国电信股份有限公司 H.323-based video chat system and method
CN101453611A (en) * 2007-12-07 2009-06-10 希姆通信息技术(上海)有限公司 Method for video communication between the deaf and the normal
CN105376513A (en) * 2015-12-02 2016-03-02 小米科技有限责任公司 Information transmission method and device
CN106803918A (en) * 2017-03-02 2017-06-06 无锡纽微特科技有限公司 A kind of video call system and implementation method
CN111698446A (en) * 2020-05-26 2020-09-22 上海智勘科技有限公司 Method and system for simultaneously transmitting text information in real-time video

Similar Documents

Publication Publication Date Title
US8270606B2 (en) Open architecture based domain dependent real time multi-lingual communication service
CN111935443B (en) Method and device for sharing instant messaging tool in real-time live broadcast of video conference
CN103716227B (en) A kind of method and apparatus for being used in instant messaging carry out information exchange
EP3902272A1 (en) Audio and video pushing method and audio and video stream pushing client based on webrtc protocol
CN110910860B (en) Online KTV implementation method and device, electronic equipment and storage medium
US20120017149A1 (en) Video whisper sessions during online collaborative computing sessions
WO2022089224A1 (en) Video communication method and apparatus, electronic device, computer readable storage medium, and computer program product
CN102984496B (en) The processing method of the audiovisual information in video conference, Apparatus and system
CN112866619B (en) A remote conference control method, device, electronic device and storage medium
CN108040061B (en) A method for live broadcast of cloud conference
CN105530535A (en) Method and system capable of realizing multi-person video watching and real-time interaction
CN103795964A (en) Video conferencing method and device thereof
CN114827518A (en) Projection video conference system
CN106789593B (en) A kind of instant message processing method, server and system merging sign language
CN104735480A (en) Information sending method and system between mobile terminal and television
CN103346953A (en) Method, device and system for group communication data interaction
CN114727120B (en) Live audio stream acquisition method and device, electronic equipment and storage medium
CN106162042A (en) A kind of method of video conference, server and terminal
CN113766165A (en) Interaction method, device, terminal and storage medium for realizing barrier-free video chat
CN102262344A (en) Projector capable of sharing images of slides played immediately
US11381628B1 (en) Browser-based video production
CN110276999A (en) A remote interactive teaching system and method with synchronous blackboard writing and live broadcast functions
CN112511847A (en) Method and device for superimposing real-time voice subtitles on video images
KR102546532B1 (en) Method for providing speech video and computing device for executing the method
CN115941881A (en) Video conference data transmission method and video conference system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20211207)