
CN207302623U - Remote speech processing system - Google Patents

Remote speech processing system

Info

Publication number
CN207302623U
CN207302623U CN201720914569.8U CN201720914569U CN207302623U CN 207302623 U CN207302623 U CN 207302623U CN 201720914569 U CN201720914569 U CN 201720914569U CN 207302623 U CN207302623 U CN 207302623U
Authority
CN
China
Prior art keywords
medium data
electric terminal
voice
word processing
processing system
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201720914569.8U
Other languages
Chinese (zh)
Inventor
王玮
谈冰
崔芳
朱胜强
苏文畅
王兆育
殷丹丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Hear Technology Co Ltd
Original Assignee
Anhui Hear Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-07-26
Filing date
2017-07-26
Publication date
2018-05-01
Application filed by Anhui Hear Technology Co Ltd
Priority to CN201720914569.8U
Application granted
Publication of CN207302623U
Legal status: Active

Landscapes

  • Machine Translation (AREA)

Abstract

The utility model discloses a remote speech processing system. The system includes: an electronic terminal, configured to send, over a network, multimedia data containing speech together with a speech processing instruction corresponding to the multimedia data; and a remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate, according to the speech processing instruction, a text processing result corresponding to the speech in the multimedia data, and return the text processing result to the electronic terminal. Through the interaction between the electronic terminal and the remote server, embodiments of the utility model reduce manual transcription, improve the efficiency of organizing multimedia content, and allow the organized content to be displayed in real time.

Description

Remote speech processing system
Technical field
Embodiments of the utility model relate to speech equipment technology, and in particular to a remote speech processing system.
Background
In industries such as media, education and stenography, workers currently need to transcribe real-time or non-real-time multimedia data into written form, and a large volume of multimedia content generally takes a long time to transcribe.
At present, existing products on the market cannot convert multimedia data into manuscript form conveniently and promptly. In many cases, a user who has made a recording must play it back manually and then write the manuscript. Once the manuscript is written, the speaker roles in it must be distinguished and the text proofread again, which consumes a great deal of labor. Subtitle products on the market likewise require timecodes to be entered and adjusted by hand, so a user needs more than three times the duration of the multimedia data to finish its subtitles, and still more time for translation if bilingual subtitles are required.
The usual way of organizing multimedia data is therefore manual: it consumes a great deal of labor, is inefficient, and fails to reflect the value of the earlier work.
Summary of the utility model
The utility model provides a remote speech processing system that reduces manual transcription, improves the efficiency of organizing multimedia content, and allows the organized content to be displayed in real time.
An embodiment of the utility model provides a remote speech processing system, which includes:
an electronic terminal, configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data;
a remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate, according to the speech processing instruction, a text processing result corresponding to the speech in the multimedia data, and return the text processing result to the electronic terminal.
Through the interaction between the electronic terminal and the remote server, embodiments of the utility model reduce manual transcription, improve the efficiency of organizing multimedia content, and allow the organized content to be displayed in real time.
Brief description of the drawings
Fig. 1 is a structural diagram of the remote speech processing system provided in Embodiment 1 of the utility model;
Fig. 2 is a structural diagram of the electronic terminal in the remote speech processing system provided in Embodiment 1 of the utility model.
Detailed description
To make the purpose, technical solution and advantages of the utility model clearer, specific embodiments of the utility model are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the utility model and do not limit it.
Embodiment 1
Fig. 1 is a structural diagram of the remote speech system provided in Embodiment 1 of the utility model. This embodiment is applicable to situations in which multimedia content needs to be organized efficiently and displayed in real time.
As shown in Fig. 1, the system includes an electronic terminal 110 and a remote server 120, where:
The electronic terminal 110 is configured to send, over a network, multimedia data containing speech and the speech processing instruction corresponding to the multimedia data.
The electronic terminal may be, but is not limited to, a mobile terminal (for example, a tablet computer or a smartphone) or a wearable device (for example, a smart watch or a fitness band).
The network used by the system may be a public network, a local area network, or another form of private network.
The multimedia data may be audio data and/or video data (meaning video data that contains speech).
The remote server 120 is connected to the electronic terminal 110 over the network and is configured to receive the multimedia data and the speech processing instruction, generate, according to the speech processing instruction, a text processing result corresponding to the speech in the multimedia data, and return the text processing result to the electronic terminal 110.
The remote speech system works as follows:
Using the electronic terminal 110, the user sends multimedia data containing speech over the network to the remote server 120, together with the speech processing instruction corresponding to that multimedia data. The remote server 120 receives the multimedia data and the speech processing instruction, generates, according to the speech processing instruction, the text processing result corresponding to the speech in the multimedia data, and returns the text processing result to the electronic terminal 110.
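The exchange described above can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the patent does not define a message format, so the class names `SpeechRequest`, `TextResult` and `RemoteServer`, the instruction name `"transcribe"`, and the placeholder handler are all hypothetical, and no real network transport or speech recognition is involved.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class SpeechRequest:
    """What the terminal sends: multimedia bytes plus a processing instruction."""
    media: bytes       # audio, or video that contains speech
    instruction: str   # hypothetical instruction name, e.g. "transcribe"


@dataclass
class TextResult:
    """What the server returns to the terminal."""
    instruction: str
    text: str


class RemoteServer:
    """Stand-in for remote server 120; handlers are placeholders, not real ASR."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[bytes], str]] = {}

    def register(self, instruction: str, handler: Callable[[bytes], str]) -> None:
        # Each instruction maps to one processing module on the server.
        self._handlers[instruction] = handler

    def handle(self, request: SpeechRequest) -> TextResult:
        handler = self._handlers.get(request.instruction)
        if handler is None:
            raise ValueError(f"unsupported instruction: {request.instruction}")
        return TextResult(request.instruction, handler(request.media))


def fake_transcribe(media: bytes) -> str:
    # Placeholder: a real module would run speech recognition on the media.
    return f"<transcript of {len(media)} bytes>"


server = RemoteServer()
server.register("transcribe", fake_transcribe)
reply = server.handle(SpeechRequest(media=b"\x00" * 16, instruction="transcribe"))
print(reply.text)
```

Keeping the instruction as a key into a handler table mirrors the idea that the server selects its processing according to the instruction it receives; other designs would serve equally well.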
On the basis of the above technical solution, the electronic terminal 110 is further configured to receive the text processing result returned by the remote server 120. The electronic terminal 110 further includes a display screen for showing the text processing result.
The electronic terminal 110 can serve not only as the sender of the multimedia data and the recipient of the text processing result, but can also provide richer functions. As shown in Fig. 2, the electronic terminal 110 is further provided with a microphone 111 for capturing audio data as the multimedia data. The electronic terminal 110 thus both generates and sends the audio data, so that the audio data is generated locally, transmitted quickly and processed quickly, which speeds up processing of the speech in the audio data. Similarly, a camera 112 may also be provided in the electronic terminal 110. The camera 112 works with the microphone 111 to capture video data containing speech as the multimedia data, so that the electronic terminal 110 both generates and sends the video data; the video data is generated locally, transmitted quickly and processed quickly, which speeds up processing of the speech in the video data.
The remote server 120 may contain several built-in speech processing modules, each implementing a corresponding speech processing instruction. For example, a speech recognition module is provided in the remote server 120 to recognize the speech in the multimedia data as text and generate a first text processing result.
The speech recognition module may convert the speech in the multimedia data directly to obtain the first text processing result, or it may automatically distinguish the different roles in the speech before producing text, so that speaker roles in the text are separated automatically by voiceprint. The speech in multimedia data has different role characteristics in different recording scenarios. For example, multimedia data generated during an interview mainly contains the interviewing reporter and one or more interviewees; multimedia data generated during teaching mainly contains the teacher's instruction and the interaction between teacher and students; multimedia data generated while taking shorthand of a conversation mainly contains the exchange between the person leading the conversation and the person being spoken to. When the speech recognition module recognizes the speech in the multimedia data, it can identify the voiceprint features of the speech at the same time and attribute speech with the same voiceprint features to the same role. During recognition, the same role tag can be added to all recognition results belonging to the same role. The speech recognition module can thus convert the complete speech in, for example, multimedia data recorded by a reporter and interviewees into a text processing result sentence by sentence, in chronological order.
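The role tagging described above can be sketched as follows. This is an assumption-laden illustration, not the patent's method: the patent does not say how voiceprints are compared, so the sketch assumes each recognized sentence already comes with a numeric voiceprint vector and uses a simple greedy match on cosine similarity. The function names, the threshold value and the sample vectors are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two voiceprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def label_roles(
    segments: List[Tuple[str, Sequence[float]]], threshold: float = 0.8
) -> List[Tuple[str, str]]:
    """Assign a role tag to each (text, voiceprint) segment, in time order."""
    references: List[Sequence[float]] = []  # first voiceprint seen for each role
    labelled: List[Tuple[str, str]] = []
    for text, voiceprint in segments:
        scores = [cosine(voiceprint, ref) for ref in references]
        if scores and max(scores) >= threshold:
            role = scores.index(max(scores))   # same voiceprint, same role
        else:
            references.append(voiceprint)      # unseen voiceprint, new role
            role = len(references) - 1
        labelled.append((f"Speaker {role + 1}", text))
    return labelled


# Made-up interview: two voices, the first one speaks twice.
interview = [
    ("What led you to this project?", [0.9, 0.1, 0.0]),
    ("We saw how long manual transcription took.", [0.1, 0.9, 0.2]),
    ("And the result?", [0.88, 0.12, 0.05]),
]
labelled = label_roles(interview)
for role, text in labelled:
    print(f"{role}: {text}")
```

Because segments are processed in order, the output stays chronological while every sentence carries the tag of the role it was attributed to.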
Building on the first text processing result obtained by the speech recognition module, the remote server 120 may also contain a built-in translation module, which translates the text in the first text processing result into the language specified in the speech processing instruction to generate a second text processing result.
The translation module translates the text in the first text processing result into the language specified in the speech processing instruction in real time, without any need to adjust timecodes, to generate the second text processing result, which improves text processing efficiency.
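One way to read "without adjusting timecodes" is that the translated result reuses the timing of the recognized sentences and only replaces their text. The sketch below illustrates that reading under stated assumptions: the `Sentence` structure, the `translate_result` function and the toy lookup table are hypothetical, and the lookup table merely stands in for a real translation engine.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Sentence:
    start: float   # seconds from the start of the media
    end: float
    text: str


def translate_result(
    first_result: List[Sentence], translate: Callable[[str], str]
) -> List[Sentence]:
    """Build the second result: same timing, translated text."""
    return [replace(s, text=translate(s.text)) for s in first_result]


# Toy lookup table standing in for a real translation engine.
TOY_DICTIONARY = {"你好": "Hello", "谢谢": "Thank you"}


def toy_translate(text: str) -> str:
    # Unknown sentences are passed through unchanged.
    return TOY_DICTIONARY.get(text, text)


first = [Sentence(0.0, 1.2, "你好"), Sentence(1.5, 2.4, "谢谢")]
second = translate_result(first, toy_translate)
for s in second:
    print(s.start, s.end, s.text)
```

Since the start and end times are copied rather than recomputed, the translated sentences line up with the originals without manual adjustment.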
The form in which each text processing result is presented is determined by the speech processing instruction. For example, the first text processing result may be a text file or a subtitle file, and the second text processing result may likewise be a text file or a subtitle file.
A text file records the text obtained by speech recognition, or the text derived by translating the recognized text. The user generally opens and reads a text file on its own.
A subtitle file also records the text found in a text file, but each sentence of the text is mapped to the timeline of the multimedia data, so the text can be displayed in sync with playback of the multimedia data. If a first text processing result (subtitle file) and a second text processing result (subtitle file) have both been generated for a piece of video data, the electronic terminal 110 can display subtitle files in several languages at the same time, according to the user's selection.
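As an illustration of sentences tied to a timeline, the sketch below writes timed sentences in the SubRip (SRT) layout. The patent does not name a subtitle format, so SRT is an assumption chosen because it is widely supported; the function names and sample cues are hypothetical. The `merge_bilingual` helper shows one possible way to show two languages at once by stacking them in a single cue.

```python
from typing import List, Tuple

Cue = Tuple[float, float, str]  # (start seconds, end seconds, sentence text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm, the timestamp style used by SubRip."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n"


def merge_bilingual(first: List[Cue], second: List[Cue]) -> List[Cue]:
    """Stack two cue lists that share timing into two-line cues."""
    return [(s, e, f"{a}\n{b}") for (s, e, a), (_, _, b) in zip(first, second)]


original = [(0.0, 1.2, "你好"), (1.5, 2.4, "谢谢")]
translated = [(0.0, 1.2, "Hello"), (1.5, 2.4, "Thank you")]
print(to_srt(merge_bilingual(original, translated)))
```

A plain text file, by contrast, would simply join the sentence texts and drop the timing.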
Through the interaction between the electronic terminal and the remote server, the technical solution of this embodiment reduces manual transcription, improves the efficiency of organizing multimedia content, and allows the organized content to be displayed in real time.
Note that the above are only preferred embodiments of the utility model and the technical principles applied. Those skilled in the art will understand that the utility model is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the utility model. Therefore, although the utility model has been described in some detail through the above embodiments, it is not limited to them; it may include further equivalent embodiments without departing from its concept, and its scope is determined by the appended claims.

Claims (9)

  1. A remote speech processing system, characterized by including:
    an electronic terminal, configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data;
    a remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate, according to the speech processing instruction, a text processing result corresponding to the speech in the multimedia data, and return the text processing result to the electronic terminal.
  2. The remote speech processing system according to claim 1, characterized in that the electronic terminal is further configured to receive the text processing result returned by the remote server.
  3. The remote speech processing system according to claim 2, characterized in that the electronic terminal further includes a display screen for showing the text processing result.
  4. The remote speech processing system according to claim 1, characterized in that the electronic terminal further includes a microphone for capturing audio data as the multimedia data.
  5. The remote speech processing system according to claim 4, characterized in that the electronic terminal further includes a camera, which works with the microphone to capture video data containing speech as the multimedia data.
  6. The remote speech processing system according to claim 1, characterized in that the remote server includes a speech recognition module for recognizing the speech in the multimedia data as text to generate a first text processing result.
  7. The remote speech processing system according to claim 6, characterized in that the remote server further includes a translation module for translating the text in the first text processing result into the language specified in the speech processing instruction to generate a second text processing result.
  8. The remote speech processing system according to claim 6 or 7, characterized in that the first text processing result is a text file or a subtitle file.
  9. The remote speech processing system according to claim 7, characterized in that the second text processing result is a text file or a subtitle file.
CN201720914569.8U 2017-07-26 2017-07-26 A kind of remote speech processing system Active CN207302623U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201720914569.8U CN207302623U (en) 2017-07-26 2017-07-26 A kind of remote speech processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201720914569.8U CN207302623U (en) 2017-07-26 2017-07-26 A kind of remote speech processing system

Publications (1)

Publication Number Publication Date
CN207302623U true CN207302623U (en) 2018-05-01

Family

ID=62447462

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201720914569.8U Active CN207302623U (en) 2017-07-26 2017-07-26 A kind of remote speech processing system

Country Status (1)

Country Link
CN (1) CN207302623U (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658919A (en) * 2018-12-17 2019-04-19 深圳市沃特沃德股份有限公司 Interpretation method, device and the translation playback equipment of multimedia file
CN112837675A (en) * 2019-11-22 2021-05-25 阿里巴巴集团控股有限公司 Speech recognition method, device and related system and equipment

Similar Documents

Publication Publication Date Title
CN103646573B (en) A kind of generation method of professional format file of panning mode tutoring system
CN107220228B (en) A kind of teaching recorded broadcast data correction device
CN104777911B (en) A kind of intelligent interactive method based on holographic technique
Tsai Inside the television newsroom: An insider's view of international news translation in Taiwan
CN109614628A (en) A kind of interpretation method and translation system based on Intelligent hardware
US20180130496A1 (en) Method and system for auto-generation of sketch notes-based visual summary of multimedia content
CN105654532A (en) Photo photographing and processing method and system
CN102209184A (en) Electronic apparatus, reproduction control system, reproduction control method, and program therefor
CN105323704A (en) User comment sharing method, device and system
CN113132780A (en) Video synthesis method and device, electronic equipment and readable storage medium
CN202444562U (en) Mobile conference system
CN109324811A (en) It is a kind of for update teaching recorded broadcast data device
CN107330961A (en) A kind of audio-visual conversion method of word and system
CN101309449A (en) Remote translation service method based on mobile phone multimedia message / short message
CN111046148A (en) Intelligent interaction system and intelligent customer service robot
CN110244921A (en) Label printing method, device, electronic equipment and system
CN111276018A (en) Network course recording method and device and terminal
CN109255130A (en) A kind of method, system and the equipment of language translation and study based on artificial intelligence
CN207302623U (en) A kind of remote speech processing system
CN110019058A (en) The sharing method and device of file operation
CN112581965A (en) Transcription method, device, recording pen and storage medium
US9906485B1 (en) Apparatus and method for coordinating live computer network events
Santano et al. Augmented reality storytelling: A transmedia exploration
CN104702758B (en) A terminal and a method for managing a multimedia notepad
CN106162376A (en) A kind of multimedia is compiled as the method and device of video playback file automatically

Legal Events

Date Code Title Description
GR01 Patent grant