CN207302623U - A remote speech processing system - Google Patents
A remote speech processing system
- Publication number
- CN207302623U
- Authority
- CN
- China
- Prior art keywords
- multimedia data
- electronic terminal
- voice
- text processing
- processing system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Machine Translation (AREA)
Abstract
The utility model discloses a remote speech processing system. The system includes: an electronic terminal, configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data; and a remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate a text processing result for the speech in the multimedia data according to the speech processing instruction, and return the text processing result to the electronic terminal. Through the interaction between the electronic terminal and the remote server, the embodiments of the utility model reduce manual transcription, improve the efficiency of organizing multimedia content, and display the organized content in real time.
Description
Technical field
The embodiments of the utility model relate to speech equipment technology, and in particular to a remote speech processing system.
Background
At present, in industries such as media, education, and stenography, workers need to transcribe real-time or non-real-time multimedia data into written form, and large volumes of multimedia content usually take a long time to organize.
Existing products on the market cannot conveniently and promptly convert multimedia data into manuscript form. Often, when a user prepares a manuscript from a recording, the recording must be played back manually and the manuscript written by hand; once the manuscript is complete, the speaker roles in it must be distinguished and the text then proofread, which consumes considerable manpower. Subtitle products on the market likewise require time codes to be entered and adjusted manually, so a user needs more than three times the duration of the multimedia data to complete its subtitles, and if bilingual subtitles are required, still more time is needed for translation.
The usual way of organizing multimedia data is manual, which not only consumes a great deal of manpower but is also inefficient, and fails to reflect the value of the earlier work.
Summary of the utility model
The utility model provides a remote speech processing system that reduces manual transcription, improves the efficiency of organizing multimedia content, and displays the organized content in real time.
An embodiment of the utility model provides a remote speech processing system, which includes:
An electronic terminal, configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data;
A remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate a text processing result for the speech in the multimedia data according to the speech processing instruction, and return the text processing result to the electronic terminal.
Through the interaction between the electronic terminal and the remote server, the embodiments of the utility model reduce manual transcription, improve the efficiency of organizing multimedia content, and display the organized content in real time.
Brief description of the drawings
Fig. 1 is a structural diagram of a remote speech processing system provided in Embodiment 1 of the utility model;
Fig. 2 is a structural diagram of the electronic terminal in the remote speech processing system provided in Embodiment 1 of the utility model.
Detailed description
To make the purpose, technical solution, and advantages of the utility model clearer, specific embodiments of the utility model are described in further detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to explain the utility model and do not limit it.
Embodiment 1
Fig. 1 is a structural diagram of a remote speech processing system provided in Embodiment 1 of the utility model. This embodiment is applicable to situations in which multimedia content needs to be organized efficiently and displayed in real time.
As shown in Fig. 1, the system comprises an electronic terminal 110 and a remote server 120, wherein:
The electronic terminal 110 is configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data.
The electronic terminal may be, but is not limited to, a mobile terminal (for example, a tablet computer or a smartphone) or a wearable device (for example, a smart watch or a sports bracelet).
The network used by the system may be a public network, a local area network, or another form of private network.
The multimedia data may be audio data and/or video data (meaning video data that contains speech).
The remote server 120 is connected to the electronic terminal 110 over the network and is configured to receive the multimedia data and the speech processing instruction, generate a text processing result for the speech in the multimedia data according to the speech processing instruction, and return the text processing result to the electronic terminal 110.
The operating principle of the remote speech processing system is as follows:
Using the electronic terminal 110, the user sends multimedia data containing speech to the remote server 120 over the network, together with the speech processing instruction corresponding to that multimedia data. The remote server 120 receives the multimedia data and the speech processing instruction, generates a text processing result for the speech in the multimedia data according to the speech processing instruction, and returns the text processing result to the electronic terminal 110.
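The exchange described above can be sketched as a small in-process model. This is only an illustration under stated assumptions: the names `SpeechRequest`, `TextResult`, `RemoteServer`, and `fake_transcribe` are hypothetical, the instruction string "transcribe" is invented for the example, and the recognizer is a placeholder, since the patent does not specify a wire format or a recognition engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class SpeechRequest:
    """What the terminal sends: multimedia data plus a processing instruction."""
    media: bytes       # multimedia data containing speech
    instruction: str   # hypothetical instruction name, e.g. "transcribe"


@dataclass(frozen=True)
class TextResult:
    """What the server returns: the text (word) processing result."""
    instruction: str
    text: str


class RemoteServer:
    """Dispatches each request to the handler registered for its instruction."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[bytes], str]] = {}

    def register(self, instruction: str, handler: Callable[[bytes], str]) -> None:
        self._handlers[instruction] = handler

    def handle(self, request: SpeechRequest) -> TextResult:
        handler = self._handlers.get(request.instruction)
        if handler is None:
            raise ValueError(f"unsupported instruction: {request.instruction!r}")
        return TextResult(request.instruction, handler(request.media))


def fake_transcribe(media: bytes) -> str:
    # Placeholder only: a real module would run speech recognition here.
    return f"<transcript of {len(media)} bytes>"


server = RemoteServer()
server.register("transcribe", fake_transcribe)

# The terminal side: send data and instruction, then receive the result.
result = server.handle(SpeechRequest(media=b"\x00" * 16, instruction="transcribe"))
print(result.text)
```

In a deployed system the call to `server.handle` would be replaced by a network request; the dispatch-by-instruction structure is the part that mirrors the description.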
On the basis of the above technical solution, the electronic terminal 110 is further configured to receive the text processing result returned by the remote server 120. Further, the electronic terminal 110 includes a display screen for displaying the text processing result.
The electronic terminal 110 may serve not only as the sender of the multimedia data and the recipient of the text processing result, but may also provide other, richer functions. As shown in Fig. 2, the electronic terminal 110 is further provided with a microphone 111 for capturing audio data as the multimedia data. The electronic terminal 110 thus acts as both the generator and the sender of the audio data, so that the audio data is generated locally, transmitted quickly, and processed quickly, which increases the speed at which the speech in the audio data is processed. Similarly, the electronic terminal 110 may further be provided with a camera 112; the camera 112 and the microphone 111 cooperate to capture video data containing speech as the multimedia data. The electronic terminal 110 thus acts as both the generator and the sender of the video data, so that the video data is generated locally, transmitted quickly, and processed quickly, which increases the speed at which the speech in the video data is processed.
The remote server 120 may have various built-in modules for speech processing, each implementing a corresponding speech processing instruction. For example, a speech recognition module is provided in the remote server 120 to perform text recognition on the speech in the multimedia data and generate a first text processing result.
The speech recognition module may directly convert the speech in the multimedia data to obtain the first text processing result, or it may automatically distinguish the different roles (speakers) in the speech and process the text accordingly, so that the text is automatically separated by role according to voiceprint. The speech in multimedia data has different role characteristics in different speech-generation scenarios. For example, multimedia data generated during an interview mainly contains the interviewing reporter and several different interviewees; multimedia data generated during teaching mainly contains the teacher's lecturing and the interaction between the teacher and the students; multimedia data generated during the stenographic recording of a conversation mainly contains the conversation between the speaker and the other party. When the speech recognition module recognizes the speech in the multimedia data, it can at the same time identify the voiceprint features in the speech and attribute speech with the same voiceprint features to the same role. During recognition, the same role label can be added to the recognition results of the same role, so that the speech recognition module can convert all of the speech in, for example, multimedia data generated by a reporter and interviewees into a text processing result word by word, sentence by sentence, and in chronological order.
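One way to picture the role labelling is a greedy grouping of recognized segments by voiceprint similarity. The sketch below is not the patent's method, which is left unspecified; it assumes each segment already carries a numeric voiceprint vector produced by some upstream extractor, and the names `Segment`, `cosine`, `label_roles`, and the 0.8 threshold are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Segment:
    start: float                 # start time in seconds
    text: str                    # recognized text for this stretch of speech
    voiceprint: Sequence[float]  # assumed output of an upstream voiceprint extractor


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two voiceprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_roles(segments: List[Segment], threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Return (role label, text) pairs in chronological order.

    A segment joins the first known role whose reference voiceprint is
    similar enough; otherwise it starts a new role.
    """
    references: List[Sequence[float]] = []
    labelled: List[Tuple[str, str]] = []
    for seg in sorted(segments, key=lambda s: s.start):
        for index, ref in enumerate(references):
            if cosine(seg.voiceprint, ref) >= threshold:
                role = index
                break
        else:
            references.append(seg.voiceprint)
            role = len(references) - 1
        labelled.append((f"Role {role + 1}", seg.text))
    return labelled


interview = [
    Segment(4.0, "How did the project start?", [0.9, 0.1]),
    Segment(0.0, "Thank you for joining us.", [1.0, 0.0]),
    Segment(2.0, "Happy to be here.", [0.0, 1.0]),
]
labelled = label_roles(interview)
for role, text in labelled:
    print(f"{role}: {text}")
```

The first and third utterances in time share a similar voiceprint and therefore receive the same label, which corresponds to the reporter and interviewee example in the text.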
On the basis of the first text processing result obtained by the speech recognition module, the remote server 120 may also have a built-in translation module, configured to translate the text in the first text processing result into the language specified by the speech processing instruction, so as to generate a second text processing result.
The translation module translates the text in the first text processing result into the language specified by the speech processing instruction in real time, without any adjustment of time codes, to generate the second text processing result, which improves text processing efficiency.
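The point that translation needs no time-code adjustment can be illustrated by translating only the text of each timed entry and copying the timing unchanged. This is a minimal sketch: `Cue`, `translate_cues`, and the dictionary-based `toy_translate` are hypothetical stand-ins, and a real translation module would call an actual translation engine.

```python
from dataclasses import dataclass, replace
from typing import Callable, List


@dataclass(frozen=True)
class Cue:
    start: float  # seconds
    end: float    # seconds
    text: str


def translate_cues(cues: List[Cue], translate: Callable[[str], str]) -> List[Cue]:
    """Build the second result from the first: new text, identical time codes."""
    return [replace(cue, text=translate(cue.text)) for cue in cues]


# Toy lookup table standing in for a real translation engine.
TOY_DICTIONARY = {"你好": "Hello", "谢谢": "Thank you"}


def toy_translate(text: str) -> str:
    return TOY_DICTIONARY.get(text, text)


first_result = [Cue(0.0, 1.5, "你好"), Cue(1.5, 3.0, "谢谢")]
second_result = translate_cues(first_result, toy_translate)
for cue in second_result:
    print(cue.start, cue.end, cue.text)
```

Because each translated entry reuses the start and end of its source entry, the two results stay aligned with the same multimedia data.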
The form in which each text processing result is presented is determined by the speech processing instruction. For example, the first text processing result is a text file or a subtitle file. The second text processing result may likewise be a text file or a subtitle file.
A text file records the text obtained by speech recognition, or the text derived by translating the recognized text; a text file is generally opened and viewed by the user on its own.
A subtitle file records the same text as a text file, but the text is mapped, sentence by sentence, to the time axis of the multimedia data, so that it can be displayed in sync with playback of the multimedia data. If both a first text processing result (subtitle file) and a second text processing result (subtitle file) have been generated for a given piece of video data, the electronic terminal 110 can display subtitle files in multiple languages at the same time, according to the user's selection.
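The difference between the two output forms can be sketched by rendering the same sentence-level entries either as plain text or as a subtitle file. The patent does not name a subtitle format; the SubRip (SRT) layout is used here purely as an assumption, and `srt_timestamp`, `to_srt`, and `to_text` are hypothetical helper names.

```python
from typing import List, Tuple

# (start seconds, end seconds, sentence text)
Cue = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the SubRip time code HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: List[Cue]) -> str:
    """Subtitle form: each sentence is tied to a span on the media time axis."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)


def to_text(cues: List[Cue]) -> str:
    """Text form: the same sentences without any timing information."""
    return "\n".join(text for _, _, text in cues)


cues = [(0.0, 1.5, "Hello"), (1.5, 3.0, "Thank you")]
print(to_srt(cues))
print(to_text(cues))
```

Rendering a translated list of entries with the same function would yield the second subtitle file, which a player could show alongside the first.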
Through the interaction between the electronic terminal and the remote server, the technical solution of this embodiment reduces manual transcription, improves the efficiency of organizing multimedia content, and displays the organized content in real time.
Note that the above describes only the preferred embodiments of the utility model and the technical principles applied. Those skilled in the art will understand that the utility model is not limited to the specific embodiments described here, and that various obvious changes, readjustments, and substitutions can be made without departing from the scope of protection of the utility model. Therefore, although the utility model has been described in further detail through the above embodiments, it is not limited to them; it may include other equivalent embodiments without departing from its concept, and its scope is determined by the appended claims.
Claims (9)
- 1. A remote speech processing system, characterized by comprising: an electronic terminal, configured to send, over a network, multimedia data containing speech and a speech processing instruction corresponding to the multimedia data; and a remote server, connected to the electronic terminal over the network, configured to receive the multimedia data and the speech processing instruction, generate a text processing result for the speech in the multimedia data according to the speech processing instruction, and return the text processing result to the electronic terminal.
- 2. The remote speech processing system according to claim 1, characterized in that the electronic terminal is further configured to receive the text processing result returned by the remote server.
- 3. The remote speech processing system according to claim 2, characterized in that the electronic terminal further comprises a display screen for displaying the text processing result.
- 4. The remote speech processing system according to claim 1, characterized in that the electronic terminal further comprises a microphone for capturing audio data as the multimedia data.
- 5. The remote speech processing system according to claim 4, characterized in that the electronic terminal further comprises a camera, which cooperates with the microphone to capture video data containing speech as the multimedia data.
- 6. The remote speech processing system according to claim 1, characterized in that the remote server comprises a speech recognition module, configured to perform text recognition on the speech in the multimedia data to generate a first text processing result.
- 7. The remote speech processing system according to claim 6, characterized in that the remote server further comprises a translation module, configured to translate the text in the first text processing result into the language specified by the speech processing instruction, so as to generate a second text processing result.
- 8. The remote speech processing system according to claim 6 or 7, characterized in that the first text processing result is a text file or a subtitle file.
- 9. The remote speech processing system according to claim 7, characterized in that the second text processing result is a text file or a subtitle file.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201720914569.8U CN207302623U (en) | 2017-07-26 | 2017-07-26 | A remote speech processing system |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201720914569.8U CN207302623U (en) | 2017-07-26 | 2017-07-26 | A remote speech processing system |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN207302623U (en) | 2018-05-01 |
Family
ID=62447462
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201720914569.8U Active CN207302623U (en) | 2017-07-26 | 2017-07-26 | A remote speech processing system |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN207302623U (en) |
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN109658919A (en) * | 2018-12-17 | 2019-04-19 | 深圳市沃特沃德股份有限公司 | Interpretation method, device and the translation playback equipment of multimedia file |
| CN112837675A (en) * | 2019-11-22 | 2021-05-25 | 阿里巴巴集团控股有限公司 | Speech recognition method, device and related system and equipment |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | GR01 | Patent grant | |