
CN113556313B - Real-time intercom intervention and alarm platform based on AI technology - Google Patents

Real-time intercom intervention and alarm platform based on AI technology

Info

Publication number
CN113556313B
Authority
CN
China
Prior art keywords
module
intervention
alarm information
training
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110108101.0A
Other languages
Chinese (zh)
Other versions
CN113556313A (en)
Inventor
谢建华
张伟雄
戴东旭
蔡存忠
陈秋林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Huanyutong Technology Co ltd
Original Assignee
Fujian Huanyutong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Huanyutong Technology Co ltd
Priority to CN202110108101.0A
Publication of CN113556313A
Application granted
Publication of CN113556313B
Active
Anticipated expiration


Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • G PHYSICS
    • G08 SIGNALLING
    • G08B SIGNALLING OR CALLING SYSTEMS; ORDER TELEGRAPHS; ALARM SYSTEMS
    • G08B3/00 Audible signalling systems; Audible personal calling systems
    • G08B3/10 Audible signalling systems; Audible personal calling systems using electric transmission; using electromagnetic transmission
    • G08B3/1008 Personal calling arrangements or devices, i.e. paging systems
    • G08B3/1016 Personal calling arrangements or devices, i.e. paging systems using wireless transmission
    • G08B3/1025 Paging receivers with audible signalling details
    • G08B3/1033 Paging receivers with audible signalling details with voice message alert
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60 Network streaming of media packets
    • H04L65/65 Network streaming protocols, e.g. real-time transport protocol [RTP] or real-time control protocol [RTCP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • General Business, Economics & Management (AREA)
  • Business, Economics & Management (AREA)
  • Electromagnetism (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention provides a real-time intercom intervention and alarm platform based on AI technology, comprising a communication server, a media resource control server and an AI voice training and recognition platform. The communication server provides communication services, converts the communication content into an audio media stream in real time, and sends the stream to the media resource control server. The media resource control server converts the audio media stream into text content and sends it to the AI voice training and recognition platform. The AI voice training and recognition platform identifies sensitive information in the text content, classifies the audio involved in the text content, sends alarm information to the communication server, and starts the intervention module in the communication server. The invention is used to identify risk information such as sensitive words, violent threats, cries for help and abnormal sounds in multiple business scenarios and to start the corresponding intervention actions, thereby purifying the conversation environment and handling unexpected events in a timely manner.

Description

Real-time intercom intervention and alarm platform based on AI technology
Technical Field
The invention belongs to the technical field of intercom intervention and alarm, and particularly relates to a real-time intercom intervention and alarm platform based on an AI technology.
Background
In daily life, communication is ubiquitous and takes many forms. In certain scenarios, even when an administrator monitors a call, it is difficult to detect sensitive information in the call and intervene in real time.
Disclosure of Invention
In order to solve the defects of the prior art, the invention aims to provide a real-time intercom intervention and alarm platform based on an AI technology so as to overcome the defects in the prior art.
In order to achieve the above purpose, the invention provides a real-time intercom intervention and alarm platform based on AI technology, which comprises a communication server, a media resource control server and an AI voice training and recognition platform. The communication server is electrically and signally connected with the media resource control server and comprises an MRCP client, a user agent, a session communication module and an intervention module, wherein the user agent is used for accessing a plurality of user terminals, the session communication module is used for acquiring the communication content of the user terminals in real time and converting it into audio media streams, and the MRCP client is used for pulling the audio media streams in real time and sending them to the media resource control server. The media resource control server is electrically and signally connected with the AI voice training and recognition platform and is used for converting the audio media streams into text content and sending the text content to the AI voice training and recognition platform. The AI voice training and recognition platform is electrically and signally connected with the communication server and comprises a voice recognition engine, a training module and an alarm module, wherein the voice recognition engine is used for receiving the text content from the media resource control server, the training module is used for identifying sensitive information in the text content through a voice recognition model and classifying the audio involved in the text content through a sound classification model, and the alarm module is used for generating alarm information of the corresponding category with a message code and sending it to the communication server, whereupon the intervention module in the communication server starts the corresponding intervention action according to the alarm information with the message code.
With this technical solution, while user terminals are in a call, the conversation content is converted into an audio stream in real time and sensitive words are identified. When sensitive information is detected, the platform identifies what kind of sound the current audio is, or what state or scene it represents, and sends alarm information of the corresponding category to the communication server so that intervention is timely. The platform can therefore identify sensitive words, distress sounds, abnormal sounds and other risk information arising in multiple business scenarios and start the corresponding intervention actions, purifying the conversation environment and handling unexpected events promptly.
As a further explanation of the real-time intercom intervention and alert platform based on AI technology of the present invention, preferably, the media resource control server includes a master server and a plurality of slave servers, the MRCP client communicates with the master server, the master server communicates with the plurality of slave servers, so that the MRCP client sends the IP address and port number of the user terminal to the master server, and the master server controls the idle slave servers to establish communication connection with the MRCP client.
As a further explanation of the real-time intercom intervention and warning platform based on the AI technology, the voice recognition engine preferably comprises a word segmentation module and a semantic analysis module, wherein the word segmentation module is used for dividing text content into word vector sets according to word segmentation sets and transmitting the word vector sets to the semantic analysis module, and the semantic analysis module is used for carrying out semantic analysis on the word vector sets, preliminarily determining classification types corresponding to the word vector sets and transmitting the classification types to the training module.
As a further explanation of the real-time intercom intervention and warning platform based on the AI technology, the warning module preferably comprises an encoder, a warning information generating module and a warning information transmitting module, wherein the training module is connected with the encoder, the encoder is connected with the warning information generating module, the warning information generating module is connected with the warning information transmitting module, the warning information transmitting module is connected with the intervention module, the encoder is used for receiving the sensitive information and the audio classification result of the training module and generating corresponding message codes and sending the corresponding message codes to the warning information generating module, the warning information generating module is used for generating warning information with the message codes after receiving the message codes, and the warning information transmitting module is used for sending the warning information with the message codes to the intervention module.
As a further explanation of the real-time intercom intervention and alarm platform based on the AI technology, the intervention module preferably comprises an alarm information receiving module, a decoder, an interruption intervention module, a reminding intervention module and a keyword silencing module, wherein the alarm information transmitting module is connected with the alarm information receiving module, the alarm information receiving module is connected with the decoder, and the decoder is respectively connected with the interruption intervention module, the reminding intervention module and the keyword silencing module. The alarm information receiving module is used for receiving the alarm information with message codes from the alarm information transmitting module and sending it to the decoder, and the decoder is used for parsing the message codes and starting the interruption intervention module, the reminding intervention module or the keyword silencing module according to the message codes. The interruption intervention module is used for cutting off the call of a user terminal, the reminding intervention module is used for sending a text warning or inserting voice to the user terminal, and the keyword silencing module is used for silencing sensitive words in the communication content of the user terminal.
As a further illustration of the AI-based real-time intercom intervention and alert platform of the present invention, preferably, the AI voice training and recognition platform includes a database module for storing a sensitive word dataset and an audio classification dataset and for providing the training module with model training data for the training set and the test set.
As a further explanation of the real-time intercom intervention and alert platform based on AI technology of the present invention, preferably, the communication server, the media resource control server and the AI speech training and recognition platform are connected by real-time media streaming.
As a further illustration of the AI-technology based real-time intercom intervention and alert platform of the present invention, preferably the communication server communicates with the media resource control server via SIP protocol.
As a further illustration of the AI technology-based real-time intercom intervention and alert platform of the present invention, preferably the speech recognition model and the sound classification model are deployed on a private CPU/GPU server.
Through the technical scheme, the model is used in an intranet or no-intranet environment, so that data privacy is ensured.
As a further illustration of the real-time intercom intervention and alert platform based on AI technology of the present invention, preferably, the speech recognition model and the sound classification model are built on DeepASR, a speech recognition system based on PaddlePaddle Fluid and Kaldi.
With this technical solution, DeepASR uses the Fluid framework to configure and train the acoustic model for speech recognition and integrates a Kaldi decoder, achieving fast, large-scale training of the acoustic model, while Kaldi handles the complex speech data preprocessing and the final decoding process.
The beneficial effects of the invention are as follows. The invention provides an intervention and alarm platform supporting real-time intercom, in which a plurality of different user terminals can be accessed through the communication server. While the user terminals are in a call, the communication connections established among the communication server, the media resource control server and the AI voice training and recognition platform allow the conversation content to be converted into an audio stream and sensitive words to be identified in real time. When sensitive information is detected, the platform identifies what kind of sound the current audio is, or what state or scene it represents, and sends alarm information of the corresponding category to the communication server so that intervention is timely. The invention can be used for identifying sensitive words, distress sounds, abnormal sounds and other risk information arising in multiple business scenarios and for starting the corresponding intervention actions, so as to purify the conversation environment and handle unexpected events promptly.
Drawings
Fig. 1 is a schematic structural diagram of the real-time intercom intervention and warning platform based on AI technology of the present invention.
Fig. 2 is a schematic diagram of a structure of a media resource control server according to the present invention.
FIG. 3 is a schematic diagram of the speech recognition engine of the present invention.
FIG. 4 is a schematic diagram of the structure of the alarm module and the intervention module of the present invention.
Detailed Description
For a further understanding of the structure, features, and other objects of the invention, preferred embodiments are described in detail below with reference to the accompanying drawings; the embodiments serve to illustrate the concepts of the invention and not to limit it.
First, referring to fig. 1, fig. 1 is a schematic structural diagram of an AI technology-based real-time intercom intervention and warning platform of the present invention. The real-time intercom intervention and alarm platform based on the AI technology comprises a communication server 1, a media resource control server 2 and an AI voice training and recognition platform 3.
The communication server 1 is electrically and signally connected with the media resource control server 2, and is used for providing communication services, the communication server 1 comprises an MRCP client 11, a user agent 12, a session communication component 13 and an intervention module 14, wherein the user agent 12 is used for accessing a plurality of user terminals, the session communication component 13 is connected with the user agent 12, the session communication component 13 is used for acquiring the communication content of the user terminals in real time and converting the communication content into audio media streams, the MRCP client 11 is connected with the session communication component 13, the MRCP client 11 is connected with the media resource control server 2, and the MRCP client 11 is used for pulling the audio media streams in real time and sending the audio media streams to the media resource control server 2.
The media resource control server 2 is electrically and signally connected to the AI speech training and recognition platform 3, and the media resource control server 2 is configured to convert the audio media stream into text content and send the text content to the AI speech training and recognition platform 3. As shown in fig. 2, the media resource control server 2 includes a master server 21 and a plurality of slave servers 22, where the MRCP client 11 communicates with the master server 21, and the master server 21 communicates with the plurality of slave servers 22, so that the MRCP client 11 sends the IP address and the port number of the user terminal to the master server 21, and the master server 21 controls the idle slave servers 22 to establish a communication connection with the MRCP client 11.
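The master/slave allocation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class names `MasterServer` and `SlaveServer`, the `busy` flag and the first-idle selection policy are all hypothetical, since the description only says that the master server directs an idle slave server to connect with the MRCP client.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SlaveServer:
    """Hypothetical slave server that serves one terminal at a time."""
    name: str
    busy: bool = False
    peer: Optional[Tuple[str, int]] = None  # (ip, port) reported by the MRCP client


@dataclass
class MasterServer:
    """Hypothetical master server that hands requests to idle slaves."""
    slaves: List[SlaveServer] = field(default_factory=list)

    def allocate(self, ip: str, port: int) -> Optional[SlaveServer]:
        # Pick the first idle slave (an assumed policy) and bind it to the terminal.
        for slave in self.slaves:
            if not slave.busy:
                slave.busy = True
                slave.peer = (ip, port)
                return slave
        return None  # all slaves are busy; the caller decides whether to retry

    def release(self, slave: SlaveServer) -> None:
        # Mark the slave idle again once its session has ended.
        slave.busy = False
        slave.peer = None


master = MasterServer([SlaveServer("slave-1"), SlaveServer("slave-2")])
first = master.allocate("10.0.0.5", 5060)
second = master.allocate("10.0.0.6", 5062)
third = master.allocate("10.0.0.7", 5064)  # no idle slave is left at this point
```

Returning `None` when every slave is busy leaves the queueing or retry policy to the caller, which the description does not specify.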
The AI voice training and recognition platform 3 is electrically and signally connected with the communication server 1 and comprises a voice recognition engine 31, a training module 32 and an alarm module 33. The media resource control server 2 is connected with the voice recognition engine 31, the voice recognition engine 31 is used for receiving the text content from the media resource control server 2, and the voice recognition engine 31 is connected with the training module 32. As shown in fig. 3, the voice recognition engine 31 comprises a word segmentation module 311 and a semantic analysis module 312, wherein the word segmentation module 311 is used for dividing the text content into word vector sets according to word segmentation sets and transmitting the word vector sets to the semantic analysis module 312, and the semantic analysis module 312 is used for carrying out semantic analysis on the word vector sets, preliminarily determining the classification categories corresponding to the word vector sets and transmitting the classification categories to the training module 32. The training module 32 comprises a voice recognition model and a sound classification model, and is used for recognizing sensitive information in the text content through the voice recognition model and classifying the audio involved in the text content through the sound classification model. The training module 32 is connected with the alarm module 33, and the alarm module 33 is connected with the intervention module 14. When the training module 32 detects sensitive information, the alarm module 33 generates alarm information of the corresponding category with a message code and sends it to the communication server 1, and the intervention module 14 in the communication server 1 starts the corresponding intervention action according to the alarm information with the message code.
As shown in fig. 4, the alarm module 33 comprises an encoder 331, an alarm information generating module 332 and an alarm information transmitting module 333, wherein the training module 32 is connected with the encoder 331, the encoder 331 is connected with the alarm information generating module 332, the alarm information generating module 332 is connected with the alarm information transmitting module 333, and the alarm information transmitting module 333 is connected with the intervention module 14. The encoder 331 is used for receiving the sensitive information and the audio classification result from the training module 32, generating the corresponding message code and sending it to the alarm information generating module 332; the alarm information generating module 332 is used for generating alarm information with the message code after receiving the message code; and the alarm information transmitting module 333 is used for sending the alarm information with the message code to the intervention module 14.
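The division of labour between the word segmentation module 311 and the semantic analysis module 312 can be illustrated with a small sketch. Everything concrete here is an assumption for illustration only: the description does not say how segmentation or semantic analysis is performed, so the sketch substitutes greedy forward maximum matching against a phrase lexicon and a keyword-count classifier, and it works on plain word lists rather than word vectors. The names `LEXICON`, `CATEGORY_TERMS`, `segment` and `classify` are hypothetical.

```python
# Hypothetical phrase lexicon standing in for the "word segmentation set".
LEXICON = {"help me", "call the police", "hurt you"}

# Hypothetical category vocabulary used for the preliminary classification.
CATEGORY_TERMS = {
    "distress": {"help me", "call the police"},
    "threat": {"hurt you"},
}


def segment(text, lexicon, max_len=4):
    """Greedy forward maximum matching over whitespace-separated tokens."""
    tokens = text.lower().split()
    words, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shrink the window.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in lexicon:
                words.append(candidate)
                i += n
                break
    return words


def classify(words):
    """Return the category with the most matching terms, or 'normal'."""
    scores = {
        category: sum(word in terms for word in words)
        for category, terms in CATEGORY_TERMS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "normal"


words = segment("Please help me now", LEXICON)
category = classify(words)
```

In this sketch the preliminary category would then be handed to the training module 32; a production system would more plausibly use a trained tokenizer and classifier in place of these lookups.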
The intervention module 14 comprises an alarm information receiving module 141, a decoder 142, an interruption intervention module 143, a reminding intervention module 144 and a keyword silencing module 145, wherein the alarm information transmitting module 333 is connected with the alarm information receiving module 141, the alarm information receiving module 141 is connected with the decoder 142, and the decoder 142 is respectively connected with the interruption intervention module 143, the reminding intervention module 144 and the keyword silencing module 145. The alarm information receiving module 141 is used for receiving the alarm information with message codes from the alarm information transmitting module 333 and transmitting it to the decoder 142, and the decoder 142 is used for parsing the message codes and starting the interruption intervention module 143, the reminding intervention module 144 or the keyword silencing module 145 according to the message codes. The interruption intervention module 143 is used for cutting off the call of the user terminal, the reminding intervention module 144 is used for sending a text warning or inserting voice to the user terminal, and the keyword silencing module 145 is used for silencing sensitive words in the communication content of the user terminal. Thus, the intervention actions started by the intervention module 14 include cutting off the call, issuing a warning, inserting voice into the call, and silencing sensitive words in the communication content of the user terminal. The encoder 331 in the alarm module 33 is matched with the decoder 142 in the intervention module 14, and the alarm information transmitting module 333 in the alarm module 33 is matched with the alarm information receiving module 141 in the intervention module 14, which ensures that the alarm information is transmitted and decoded correctly and also improves security.
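The matched encoder 331 and decoder 142 can be pictured as a shared code table plus a dispatch step. The sketch below is an assumption-laden illustration: the description does not define concrete message codes, a wire format, or which category triggers which action, so the numeric codes, the JSON payload, the category names and the handler functions are all hypothetical.

```python
import json

# Hypothetical message codes; the description does not define concrete values.
CODE_INTERRUPT = 1
CODE_REMIND = 2
CODE_MUTE_KEYWORD = 3

# Assumed mapping from detection category to message code (illustrative only).
CATEGORY_TO_CODE = {
    "violent_threat": CODE_INTERRUPT,
    "distress_sound": CODE_REMIND,
    "sensitive_word": CODE_MUTE_KEYWORD,
}


def encode_alarm(category: str, detail: str) -> str:
    """Alarm side: attach a message code and serialize the alarm information."""
    code = CATEGORY_TO_CODE[category]
    return json.dumps({"code": code, "category": category, "detail": detail})


def cut_off_call(alarm: dict) -> str:
    return f"cut off call: {alarm['detail']}"


def remind_user(alarm: dict) -> str:
    return f"warn user: {alarm['detail']}"


def mute_keyword(alarm: dict) -> str:
    return f"mute keyword: {alarm['detail']}"


# Intervention side: one handler per message code.
HANDLERS = {
    CODE_INTERRUPT: cut_off_call,
    CODE_REMIND: remind_user,
    CODE_MUTE_KEYWORD: mute_keyword,
}


def decode_and_dispatch(message: str) -> str:
    """Parse the message code and start the matching intervention handler."""
    alarm = json.loads(message)
    handler = HANDLERS.get(alarm.get("code"))
    if handler is None:
        raise ValueError(f"unknown message code: {alarm.get('code')!r}")
    return handler(alarm)


result = decode_and_dispatch(encode_alarm("sensitive_word", "badword"))
```

Rejecting unknown codes reflects the point that both sides must agree on the same code table for alarm information to be decoded correctly.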
When the AI voice training and recognition platform 3 detects sensitive information, it identifies what kind of sound the current audio is, or what state or scene it represents, and sends alarm information of the corresponding category to the communication server 1 so that intervention is timely. The platform can be used for recognizing sensitive words, distress sounds, abnormal sounds and other risk information arising in multiple business scenarios and for starting the corresponding intervention actions, so as to purify the conversation environment and handle unexpected events promptly.
Preferably, the AI speech training and recognition platform 3 further comprises a database module 34, the database module 34 being connected to the training module 32, the database module 34 being adapted to store the sensitive word dataset and the audio classification dataset, and to provide the training module 32 with model training data for the training set and the test set. The AI voice training and recognition platform 3 provides a real-time voice transcription interface that connects over the websocket protocol, so that recognition results are returned while the audio is still being uploaded and the audio stream is transcribed into text in real time. The speech recognition model and the sound classification model are built on DeepASR, a speech recognition system based on PaddlePaddle Fluid and Kaldi. DeepASR uses the Fluid framework to configure and train the acoustic model for speech recognition and integrates a Kaldi decoder, achieving fast, large-scale training of the acoustic model, while Kaldi handles the complex speech data preprocessing and the final decoding process. The trained speech recognition model and the trained sound classification model are deployed on a private CPU/GPU server, and the models are used in an intranet or non-intranet environment to ensure data privacy. The models may also be published as an API and used by calling that API.
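How the database module 34 might hand a training set and a test set to the training module 32 can be sketched as a simple shuffled split. This is only an illustration under assumptions: the description does not state the split ratio, the sampling method or the record format, so the `split_dataset` helper, the 80/20 ratio, the fixed seed and the sample records below are all hypothetical.

```python
import random


def split_dataset(samples, test_ratio=0.2, seed=0):
    """Shuffle a copy of the samples and split it into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical labelled records standing in for the sensitive word dataset.
samples = [(f"utterance {i}", "sensitive" if i % 3 == 0 else "normal") for i in range(10)]
train_set, test_set = split_dataset(samples)
```

Seeding the shuffle keeps the split stable across runs, which makes evaluation results on the test set comparable between training rounds.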
Preferably, the communication server 1, the media resource control server 2 and the AI speech training and recognition platform 3 are connected through real-time media streaming. The communication server 1 communicates with the media resource control server 2 via the SIP protocol. The MRCP client 11 comprises a SIP protocol stack and an MRCP protocol stack, wherein the MRCP protocol stack of the MRCP client 11 is used for calling an API interface of the media resource control server 2, the API interface creates a SIP dialog through the SIP protocol stack of the MRCP client 11 and carries information of the media resource control server 2, and the SIP protocol stack of the MRCP client 11 is used for initializing a media session for the media resource control server 2 through RTP and creating a control session for the media resource control server 2 through the MRCP protocol stack of the MRCP client 11. The media resource control server 2 also comprises an MRCP protocol stack and a SIP protocol stack, and the media resource control server 2 comprises various media resources such as speech recognition, speech synthesis, speech recording, speaker verification, voiceprint matching.
The real-time intercom intervention and alarm platform based on the AI technology can be applied to live audio streaming. The live streaming equipment is connected to the user agent 12 of the communication server 1, the communication server 1 sends the live audio to the media resource control server 2, and the media resource control server 2 converts the audio into text and sends the text to the AI voice training and recognition platform 3. The AI voice training and recognition platform 3 detects whether the text of the live audio contains sensitive words; if so, it sends alarm information to the communication server 1, which can silence the sensitive words, cut off the live stream, or send a warning to the live room. This saves manual supervision cost, ensures the content safety of the live room and purifies the network environment. The platform can also be applied to recognizing call content, so that intervention is timely when an unexpected event occurs, and to public places such as schools and banks, where the AI voice training and recognition platform 3 recognizes audio content so that unexpected events are handled promptly.
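For the live streaming case, silencing sensitive words amounts to deciding which time spans of the audio to mute. The sketch below is a hedged illustration, not the patented method: it assumes the recognizer returns per-word start and end times in seconds, which the description does not state, and the `mute_intervals` helper, its `pad` parameter and the sample data are hypothetical.

```python
def mute_intervals(timed_words, sensitive, pad=0.05):
    """Return merged (start, end) spans, in seconds, that should be silenced.

    timed_words is a list of (word, start, end) tuples; pad widens each span
    slightly so that word boundaries are fully covered.
    """
    spans = sorted(
        (max(0.0, start - pad), end + pad)
        for word, start, end in timed_words
        if word.lower() in sensitive
    )
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching spans collapse into one mute region.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Hypothetical transcript with word timings and a hypothetical sensitive word set.
transcript = [
    ("hello", 0.0, 0.4),
    ("badword", 0.5, 0.9),
    ("badword", 0.9, 1.3),
    ("bye", 1.5, 1.8),
]
spans = mute_intervals(transcript, {"badword"}, pad=0.0)
```

Merging adjacent spans means the audio path applies one continuous mute instead of toggling between consecutive sensitive words.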
It should be noted that the foregoing summary and the detailed description are intended to demonstrate practical applications of the technical solution provided by the present invention, and should not be construed as limiting the scope of the present invention. Various modifications, equivalent alterations, or improvements will occur to those skilled in the art, and are within the spirit and principles of the invention. The scope of the invention is defined by the appended claims.

Claims (8)

1. A real-time intercom intervention and alarm platform based on AI technology, characterized by comprising a communication server (1), a media resource control server (2) and an AI voice training and recognition platform (3), wherein,
The communication server (1) is electrically and signally connected with the media resource control server (2), the communication server (1) comprises an MRCP client (11), a user agent (12), a session communication component (13) and an intervention module (14), wherein the user agent (12) is used for accessing a plurality of user terminals, the session communication component (13) is used for acquiring communication contents of the user terminals in real time and converting the communication contents into audio media streams, and the MRCP client (11) is used for pulling the audio media streams in real time and sending the audio media streams to the media resource control server (2);
The media resource control server (2) is electrically and signally connected with the AI voice training and recognition platform (3), and the media resource control server (2) is used for converting the audio media stream into text content and transmitting the text content to the AI voice training and recognition platform (3);
The AI voice training and recognition platform (3) is electrically and signally connected with the communication server (1), the AI voice training and recognition platform (3) comprises a speech recognition engine (31), a training module (32) and an alarm module (33), wherein the speech recognition engine (31) is used for receiving the text content from the media resource control server (2), the training module (32) is used for identifying sensitive information involved in the text content through a speech recognition model and classifying the audio involved in the text content through a sound classification model, the alarm module (33) is used for generating alarm information with message codes of corresponding types and sending the alarm information to the communication server (1), and the intervention module (14) in the communication server (1) starts corresponding intervention actions according to the alarm information with the message codes;
The alarm module (33) comprises an encoder (331), an alarm information generating module (332) and an alarm information transmitting module (333), wherein the training module (32) is connected with the encoder (331), the encoder (331) is connected with the alarm information generating module (332), the alarm information generating module (332) is connected with the alarm information transmitting module (333), the alarm information transmitting module (333) is connected with the intervention module (14), the encoder (331) is used for receiving sensitive information and audio classification results of the training module (32) and generating corresponding message codes and sending the corresponding message codes to the alarm information generating module (332), the alarm information generating module (332) is used for generating alarm information with the message codes after receiving the message codes, and the alarm information transmitting module (333) is used for sending the alarm information with the message codes to the intervention module (14);
The intervention module (14) comprises an alarm information receiving module (141), a decoder (142), an interrupt intervention module (143), a reminding intervention module (144) and a keyword silencing module (145), wherein the alarm information transmitting module (333) is connected with the alarm information receiving module (141), the alarm information receiving module (141) is connected with the decoder (142), the decoder (142) is respectively connected with the interrupt intervention module (143), the reminding intervention module (144) and the keyword silencing module (145), the alarm information receiving module (141) is used for receiving the alarm information with message codes from the alarm information transmitting module (333) and sending the alarm information to the decoder (142), the decoder (142) is used for parsing the message codes and starting the interrupt intervention module (143), the reminding intervention module (144) or the keyword silencing module (145) according to the message codes, the interrupt intervention module (143) is used for cutting off a call of a user terminal, the reminding intervention module (144) is used for sending text or inserting a voice prompt to the user terminal, and the keyword silencing module (145) is used for silencing sensitive words in the communication content of the user terminal.
2. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the media resource control server (2) comprises a master server (21) and a plurality of slave servers (22), the MRCP client (11) communicates with the master server (21), and the master server (21) communicates with the plurality of slave servers (22), such that the MRCP client (11) sends the IP address and port number of the user terminal to the master server (21), and the master server (21) controls idle slave servers (22) to establish a communication connection with the MRCP client (11).
3. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the speech recognition engine (31) comprises a word segmentation module (311) and a semantic analysis module (312), the word segmentation module (311) is used for dividing the text content into word vector sets according to word sets and transmitting the word vector sets to the semantic analysis module (312), and the semantic analysis module (312) is used for performing semantic analysis on the word vector sets, preliminarily determining the classification categories corresponding to the word vector sets, and transmitting the classification categories to the training module (32).
4. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the AI voice training and recognition platform (3) comprises a database module (34), the database module (34) being used for storing a sensitive word dataset and an audio classification dataset, which are provided to the training module (32) as the training set and test set for model training.
5. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the communication server (1), the media resource control server (2) and the AI voice training and recognition platform (3) are connected by real-time media streaming.
6. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the communication server (1) communicates with the media resource control server (2) via the SIP protocol.
7. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the speech recognition model and the sound classification model are deployed on a private CPU/GPU server.
8. The real-time intercom intervention and alarm platform based on AI technology of claim 1, wherein the speech recognition model and the sound classification model use DeepASR, a speech recognition system based on PaddlePaddle Fluid and Kaldi.
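
By way of illustration only, the message-code path recited in claim 1 (encoder 331, alarm information generating module 332, alarm information receiving module 141, decoder 142, and the three intervention modules 143 to 145) can be sketched as follows. The numeric code values, the audio class labels, the dictionary layout of the alarm information, and the encoding policy are hypothetical; the claims state only that a message code of a corresponding type is generated and that the decoder starts one of the three modules according to it.

```python
from enum import IntEnum


class MessageCode(IntEnum):
    # Hypothetical code values; the claims do not define a numbering.
    INTERRUPT = 1
    REMIND = 2
    SILENCE_KEYWORD = 3


def encode(sensitive_words, audio_class):
    """Encoder (331): map recognition results to a message code, or None.

    The policy and the class labels below are assumptions for illustration.
    """
    if audio_class == "emergency":
        return MessageCode.INTERRUPT
    if sensitive_words:
        return MessageCode.SILENCE_KEYWORD
    if audio_class == "suspicious":
        return MessageCode.REMIND
    return None


def build_alarm(code, sensitive_words):
    """Alarm information generating module (332): attach the message code."""
    return {"code": int(code), "keywords": list(sensitive_words)}


class InterventionModule:
    """Intervention module (14): receives alarm information and dispatches it."""

    def __init__(self):
        self.log = []  # records the actions taken, in place of real call control
        self._handlers = {
            MessageCode.INTERRUPT: self._interrupt,
            MessageCode.REMIND: self._remind,
            MessageCode.SILENCE_KEYWORD: self._silence,
        }

    def receive(self, alarm):
        """Receiving module (141) plus decoder (142): parse code, start a module."""
        code = MessageCode(alarm["code"])
        self._handlers[code](alarm)

    def _interrupt(self, alarm):
        self.log.append("interrupt")  # stands in for cutting off the call

    def _remind(self, alarm):
        self.log.append("remind")  # stands in for sending text or a voice prompt

    def _silence(self, alarm):
        self.log.append("silence:" + ",".join(alarm["keywords"]))
```

In this sketch, `InterventionModule().receive(build_alarm(encode(["scam"], "normal"), ["scam"]))` records a keyword-silencing action. Keeping the code-to-handler mapping in a single table mirrors the decoder's role of selecting exactly one intervention module per message code.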

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110108101.0A CN113556313B (en) 2021-01-27 2021-01-27 Real-time intercom intervention and alarm platform based on AI technology

Publications (2)

Publication Number Publication Date
CN113556313A CN113556313A (en) 2021-10-26
CN113556313B true CN113556313B (en) 2024-12-24

Family

ID=78130065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110108101.0A Active CN113556313B (en) 2021-01-27 2021-01-27 Real-time intercom intervention and alarm platform based on AI technology

Country Status (1)

Country Link
CN (1) CN113556313B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115240652A (en) * 2022-06-02 2022-10-25 福建新大陆通信科技股份有限公司 Emergency broadcast sensitive word recognition method
CN116229987B (en) * 2022-12-13 2023-11-21 广东保伦电子股份有限公司 Campus voice recognition method, device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111158915A (en) * 2019-12-31 2020-05-15 厦门快商通科技股份有限公司 Master-slave relationship switching method, slave server, master server and system
CN214799530U (en) * 2021-01-27 2021-11-19 福建环宇通信息科技股份公司 A real-time intercom intervention and alarm platform based on AI technology

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20130114852A (en) * 2012-04-10 2013-10-21 삼성에스엔에스 주식회사 Voice processing device and method thereof using mrcp
US9413891B2 (en) * 2014-01-08 2016-08-09 Callminer, Inc. Real-time conversational analytics facility
CN107918633B (en) * 2017-03-23 2021-07-02 广州思涵信息科技有限公司 Sensitive public opinion content identification method and early warning system based on semantic analysis technology
CN110675951A (en) * 2019-08-26 2020-01-10 北京百度网讯科技有限公司 Intelligent disease diagnosis method and device, computer equipment and readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant