CN110769167A - Method for video dubbing based on text-to-speech technology - Google Patents
Method for video dubbing based on text-to-speech technology
- Publication number
- CN110769167A (application number CN201911042390.8A)
- Authority
- CN
- China
- Prior art keywords
- file
- video
- text
- dubbing
- audio
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H: ELECTRICITY > H04: ELECTRIC COMMUNICATION TECHNIQUE > H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N5/00: Details of television systems > H04N5/222: Studio circuitry; Studio devices; Studio equipment > H04N5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects > H04N5/265: Mixing
- G: PHYSICS > G10: MUSICAL INSTRUMENTS; ACOUSTICS > G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING > G10L13/00: Speech synthesis; Text to speech systems
- G: PHYSICS > G10: MUSICAL INSTRUMENTS; ACOUSTICS > G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING > G10L13/00: Speech synthesis; Text to speech systems > G10L13/02: Methods for producing synthetic speech; Speech synthesisers > G10L13/033: Voice editing, e.g. manipulating the voice of the synthesiser
- G: PHYSICS > G10: MUSICAL INSTRUMENTS; ACOUSTICS > G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING > G10L13/00: Speech synthesis; Text to speech systems > G10L13/02: Methods for producing synthetic speech; Speech synthesisers > G10L13/04: Details of speech synthesis systems, e.g. synthesiser structure or memory management > G10L13/047: Architecture of speech synthesisers
- H: ELECTRICITY > H04: ELECTRIC COMMUNICATION TECHNIQUE > H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION > H04L67/00: Network arrangements or protocols for supporting network services or applications > H04L67/01: Protocols > H04L67/06: Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
- H: ELECTRICITY > H04: ELECTRIC COMMUNICATION TECHNIQUE > H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N5/00: Details of television systems > H04N5/222: Studio circuitry; Studio devices; Studio equipment > H04N5/262: Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects > H04N5/278: Subtitling
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Computer Networks & Wireless Communication (AREA)
- Television Signal Processing For Recording (AREA)
- Studio Circuits (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Abstract
The invention discloses a method for dubbing video based on text-to-speech technology, belonging to the technical field of text-to-speech and comprising the following steps: S1: selecting an original video file; S2: inserting dubbing text; S3: transmitting the text of step S2 to a text-to-speech server section by section, the server generating the dubbing file and returning it to the position of the original text, so that an audio interval is formed; S4: inserting a blank audio file into the audio interval of step S3; S5: synthesizing audio, namely combining the dubbing file of step S3 and the blank audio file of step S4 into a synthesized audio file; S6: decomposing the original video file into an original audio file and a new video file; S7: mixing, namely mixing the original audio file and the synthesized audio file to obtain a total audio file; S8: combining the total audio file with the new video file to obtain a synthesized video file. The scheme makes video dubbing simple and easy to use, so that professional-quality dubbing can be produced without professional knowledge.
Description
Technical Field
The invention relates to the technical field of text-to-speech, in particular to a method for dubbing video based on a text-to-speech technology.
Background
Text-to-speech conversion (TTS), also commonly referred to as continuous text-to-speech synthesis, allows an electronic device to receive an input text string and provide a converted representation of the text string in the form of synthesized speech.
In digital multimedia processing, video dubbing is a post-production task. It generally requires dedicated software and a dedicated recording studio, where a voice actor operates the software to complete the dubbing. First, the original audio is stripped from the video. Second, the duration of the video segment to be dubbed and the start time of the dubbing are determined. The voice actor then narrates while the recording runs in sync with the video. When one segment is finished, the next segment is processed in the same way, and the steps are repeated until the whole video has been dubbed.
With the popularity of short-video platforms such as Douyin and Kuaishou, a large number of video-production enthusiasts have emerged. The traditional production method is too complex for them, because dubbing requires professional voice actors or professional equipment. To meet the dubbing needs of mass video production and to lower its threshold and cost, the invention provides a video dubbing method. It addresses the problems that manual dubbing is laborious and expensive, places high demands on equipment, is prone to noise, is inconvenient to operate, and requires professional voice actors. The invention also makes video dubbing simple and easy to use, so that professional-quality dubbing can be produced without professional knowledge.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method for dubbing video based on text-to-speech technology, which addresses the complex dubbing process and the high demands on dubbing equipment in the prior art.
The purpose of the invention can be achieved by the following technical scheme:
A method for video dubbing based on text-to-speech technology comprises the following steps:
S1: selecting an original video file by importing the video file from the storage of a mobile phone;
S2: inserting dubbing text at different positions of the video;
S3: transmitting the text of step S2 to a text-to-speech server section by section; the server generates the dubbing file and returns it to the position of the original text, so that an audio interval is formed;
S4: inserting a blank audio file into the audio interval of step S3;
S5: synthesizing audio, namely combining the dubbing file of step S3 and the blank audio file of step S4 into a synthesized audio file;
S6: decomposing the original video file into an original audio file and a new video file;
S7: mixing, namely mixing the original audio file and the synthesized audio file to obtain a total audio file;
S8: combining the total audio file with the new video file to obtain a synthesized video file.
As a preferred embodiment of the present invention, in step S1, the method for selecting an original video file further includes shooting a video with the mobile phone camera.
As a preferred embodiment of the present invention, in step S2, the dubbing text is inserted as follows: using the video time as the coordinate, the start and end time points of each text are inserted in sequence, and the corresponding text is then labeled, in order, in the resulting time array.
As a preferred embodiment of the present invention, in step S2, an identifier of a time interval duration is inserted into the dubbing text.
As a preferred embodiment of the present invention, in step S3, the speed and pitch of the text-to-speech output are set.
As a preferred embodiment of the present invention, in step S7, the volumes of the original audio file and the synthesized audio file are set before mixing.
As a preferred embodiment of the present invention, step S8 further includes converting the text of step S2 into a subtitle file and merging the subtitle file into the synthesized video.
As a preferred embodiment of the present invention, when the text is converted into a subtitle file, the size, color, and background color of the text are set.
The invention has the following beneficial effects:
The technical scheme converts text into speech by means of text-to-speech technology and mixes the speech with the original sound of the video to generate a new video file. The method covers converting text into speech, setting the speech speed, generating blank audio files, selecting the position at which playback starts, separating the audio from the video, mixing the original audio of the video with the speech converted from the text, adjusting the volume, and combining the audio with the video. The method makes video dubbing simple and easy to use, so that professional-quality dubbing can be produced without professional knowledge.
Drawings
To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing them are briefly introduced below. Those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flow chart of a method of the present invention;
FIG. 2 is a schematic diagram of text-to-speech according to the present invention;
FIG. 3 is a schematic diagram of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1 and fig. 2, a method for dubbing a video based on text-to-speech technology includes the following steps:
S1: selecting an original video file by importing the video file from the storage of a mobile phone;
S2: inserting dubbing text at different positions of the video;
S3: transmitting the text of step S2 to a text-to-speech server section by section; the server generates the dubbing file and returns it to the position of the original text, so that an audio interval is formed;
S4: inserting a blank audio file into the audio interval of step S3;
S5: synthesizing audio, namely combining the dubbing file of step S3 and the blank audio file of step S4 into a synthesized audio file;
S6: decomposing the original video file into an original audio file and a new video file;
S7: mixing, namely mixing the original audio file and the synthesized audio file to obtain a total audio file;
S8: combining the total audio file with the new video file to obtain a synthesized video file.
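Steps S3 to S5 can be pictured as laying the returned dubbing clips on a timeline and padding the gaps with blank audio. The following Python sketch is only an illustration under stated assumptions: the names `DubClip` and `build_timeline` are hypothetical, times are in milliseconds, and clips are assumed not to overlap. The patent does not specify any of these details.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DubClip:
    start_ms: int     # position in the video where the clip begins (assumed unit: ms)
    duration_ms: int  # length of the speech clip returned by the TTS server


def build_timeline(clips: List[DubClip], video_ms: int) -> List[Tuple[str, int]]:
    """Return ('silence' | 'speech', milliseconds) entries spanning the whole video."""
    timeline: List[Tuple[str, int]] = []
    cursor = 0
    for clip in sorted(clips, key=lambda c: c.start_ms):
        if clip.start_ms < cursor:
            raise ValueError("dubbing clips overlap")
        if clip.start_ms > cursor:
            # S4: fill the audio interval before this clip with blank audio.
            timeline.append(("silence", clip.start_ms - cursor))
        timeline.append(("speech", clip.duration_ms))
        cursor = clip.start_ms + clip.duration_ms
    if cursor > video_ms:
        raise ValueError("dubbing runs past the end of the video")
    if cursor < video_ms:
        # Pad the tail so the synthesized track is as long as the video.
        timeline.append(("silence", video_ms - cursor))
    return timeline
```

Concatenating the entries in order (S5) would give a synthesized audio track whose total length equals the video length, which is what the later mixing step relies on.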
In step S1, the method for selecting an original video file further includes shooting a video with a mobile phone camera.
In step S2, the dubbing text is inserted as follows: using the video time as the coordinate, the start and end time points of each text are inserted in sequence, and the corresponding text is then labeled, in order, in the resulting time array.
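One possible reading of the "time array" is a flat list of alternating start and end time points, with one text labeled per pair. The sketch below assumes that layout; `TextSegment` and `make_segments` are hypothetical names, and millisecond units are an assumption rather than something the patent states.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TextSegment:
    start_ms: int  # start time point on the video time axis
    end_ms: int    # end time point on the video time axis
    text: str      # dubbing text labeled for this interval


def make_segments(time_points: List[int], texts: List[str]) -> List[TextSegment]:
    """Pair consecutive (start, end) time points with their texts, in order."""
    if len(time_points) != 2 * len(texts):
        raise ValueError("expected one start and one end time point per text")
    segments: List[TextSegment] = []
    for i, text in enumerate(texts):
        start, end = time_points[2 * i], time_points[2 * i + 1]
        if start >= end:
            raise ValueError("a segment must end after it starts")
        if segments and start < segments[-1].end_ms:
            raise ValueError("time points must be inserted in sequence")
        segments.append(TextSegment(start, end, text))
    return segments
```

For example, `make_segments([0, 3000, 5000, 9000], ["Hello", "World"])` yields two segments, the first covering 0 to 3 s and the second 5 to 9 s.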
In step S2, a user-defined identifier of a time interval duration is inserted into the dubbing text to implement a speech pause: the text-to-speech server recognizes the identifier and inserts blank audio of the corresponding duration.
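The patent does not define the syntax of the pause identifier. As a minimal sketch, assume a hypothetical marker of the form `[pause:500]` (duration in milliseconds); a server-side parser could then split the text into spoken parts and pauses as follows.

```python
import re
from typing import List, Tuple, Union

# Hypothetical marker syntax, e.g. "[pause:500]" for a 500 ms pause.
# The real identifier format is not given in the source.
PAUSE_PATTERN = re.compile(r"\[pause:(\d+)\]")


def split_pauses(text: str) -> List[Tuple[str, Union[str, int]]]:
    """Split dubbing text into ('text', str) and ('pause', milliseconds) parts."""
    parts: List[Tuple[str, Union[str, int]]] = []
    position = 0
    for match in PAUSE_PATTERN.finditer(text):
        spoken = text[position:match.start()].strip()
        if spoken:
            parts.append(("text", spoken))
        parts.append(("pause", int(match.group(1))))
        position = match.end()
    tail = text[position:].strip()
    if tail:
        parts.append(("text", tail))
    return parts
```

Each `('pause', n)` entry would be rendered as `n` milliseconds of blank audio, and each `('text', ...)` entry would be synthesized as speech.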
In step S3, the speed and pitch of the text-to-speech output are set, and the text-to-speech server adjusts the speed and pitch of the speech in the generated dubbing file accordingly, so that the result better fits the usage scenario.
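How the speed and pitch settings reach the server is not described. One plausible arrangement, shown purely as an assumption, is to send them alongside each text section in a JSON request body. The field names `text`, `speed`, and `pitch`, and the accepted range of 0.5 to 2.0, are hypothetical.

```python
import json


def build_tts_request(text: str, speed: float = 1.0, pitch: float = 1.0) -> str:
    """Serialize one text section and its voice settings as a JSON request body.

    Field names and value ranges are illustrative assumptions, not a real API.
    """
    if not text.strip():
        raise ValueError("text section is empty")
    for name, value in (("speed", speed), ("pitch", pitch)):
        if not 0.5 <= value <= 2.0:
            raise ValueError(f"{name} must be between 0.5 and 2.0")
    payload = {"text": text, "speed": speed, "pitch": pitch}
    # ensure_ascii=False keeps non-ASCII dubbing text readable in the payload.
    return json.dumps(payload, ensure_ascii=False)
```

Validating the settings on the client before sending avoids a round trip for values the server would reject anyway.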
In step S7, the volumes of the original audio file and the synthesized audio file are set before mixing. The original audio file is the background music of the original video file, and the synthesized audio file is the voice. Adjusting the two volumes balances the voice against the background music as the usage scenario requires.
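At the sample level, mixing two tracks with separate volumes amounts to a weighted sum followed by clipping. The sketch below assumes 16-bit signed PCM samples of equal length and the same sample rate; `mix_pcm16` and its gain parameters are hypothetical names, and the patent does not say which mixing formula is used.

```python
from typing import List


def mix_pcm16(original: List[int], dubbing: List[int],
              original_gain: float, dubbing_gain: float) -> List[int]:
    """Mix two equal-length 16-bit PCM tracks with per-track volume gains."""
    if len(original) != len(dubbing):
        raise ValueError("tracks must have the same number of samples")
    mixed: List[int] = []
    for background, voice in zip(original, dubbing):
        value = int(round(background * original_gain + voice * dubbing_gain))
        # Clip to the 16-bit signed range to avoid wrap-around distortion.
        mixed.append(max(-32768, min(32767, value)))
    return mixed
```

Lowering `original_gain` relative to `dubbing_gain` keeps the background music underneath the voice; the equal-length requirement is why the synthesized track is padded with blank audio to the full video duration.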
Step S8 further includes converting the text of step S2 into a subtitle file, setting the size, color, and background color of the text, and merging the subtitle file into the synthesized video. This adds subtitles to the video file, displayed in different sizes and colors as needed.
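The subtitle format is not named in the source. As an illustration only, the timed text segments could be written out in the SubRip (SRT) format, which carries timing and text; size and color settings would need a styled format such as ASS or renderer-side options, which this sketch does not cover. The function names are hypothetical.

```python
from typing import List, Tuple


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: List[Tuple[int, int, str]]) -> str:
    """Render (start_ms, end_ms, text) segments as the contents of an SRT file."""
    blocks = []
    for index, (start_ms, end_ms, text) in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{format_timestamp(start_ms)} --> {format_timestamp(end_ms)}\n"
            f"{text}\n"
        )
    # A blank line separates consecutive subtitle blocks.
    return "\n".join(blocks)
```

Because the subtitles reuse the same start and end time points as the dubbing text, the displayed text stays aligned with the synthesized speech.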
As shown in fig. 3, after the text-to-speech server converts the text into the dubbing file, the audio and video of the original video are separated to obtain an original audio file and a video file of the same total duration. The volumes of the original audio file and the dubbing file (synthesized audio file) are then set according to the user's needs. Next, the original audio file and the dubbing file (synthesized audio file), which have the same duration, are mixed to obtain a total audio file. Finally, the total audio file and the video file are combined to generate the output video file (synthesized video file).
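The patent does not name a tool for separating, mixing, and recombining the streams. One common way to do it, shown here as an assumption, is the `ffmpeg` command-line program. The sketch only builds the argument lists and does not run them; the function names and all file names are hypothetical.

```python
from typing import List


def demux_commands(source: str, audio_out: str, video_out: str) -> List[List[str]]:
    """S6: split the source into an audio-only file and a video-only file."""
    return [
        ["ffmpeg", "-i", source, "-vn", audio_out],                  # drop video
        ["ffmpeg", "-i", source, "-an", "-c:v", "copy", video_out],  # drop audio
    ]


def mix_command(original_audio: str, dubbing_audio: str,
                original_volume: float, dubbing_volume: float,
                total_out: str) -> List[str]:
    """S7: set each track's volume, then mix the two tracks into one."""
    graph = (
        f"[0:a]volume={original_volume}[bg];"
        f"[1:a]volume={dubbing_volume}[voice];"
        "[bg][voice]amix=inputs=2:duration=first[mix]"
    )
    return ["ffmpeg", "-i", original_audio, "-i", dubbing_audio,
            "-filter_complex", graph, "-map", "[mix]", total_out]


def mux_command(video_only: str, total_audio: str, output: str) -> List[str]:
    """S8: combine the video-only file with the mixed audio track."""
    return ["ffmpeg", "-i", video_only, "-i", total_audio,
            "-map", "0:v", "-map", "1:a", "-c:v", "copy", "-c:a", "aac", output]
```

Each list could be passed to `subprocess.run(..., check=True)` in order. Note that the `amix` filter may rescale its inputs by default, so the resulting levels can differ from a plain weighted sum.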
The technical scheme converts text into speech by means of text-to-speech technology and mixes the speech with the original sound of the video to generate a new video file. The method covers converting text into speech, setting the speech speed, generating blank audio files, selecting the position at which playback starts, separating the audio from the video, mixing the original audio of the video with the speech converted from the text, adjusting the volume, and combining the audio with the video. The scheme addresses the problems that manual dubbing is laborious and expensive, places high demands on equipment, is prone to noise, is inconvenient to operate, and requires professional voice actors. It also makes video dubbing simple and easy to use, so that professional-quality dubbing can be produced without professional knowledge.
In the description herein, references to the description of "one embodiment," "an example," "a specific example" or the like are intended to mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
The foregoing shows and describes the general principles, essential features, and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above, which are described in the specification and illustrated only to illustrate the principle of the present invention, but that various changes and modifications may be made therein without departing from the spirit and scope of the present invention, which fall within the scope of the invention as claimed.
Claims (8)
1. A method for video dubbing based on text-to-speech technology, characterized in that the method comprises the following steps:
S1: selecting an original video file by importing the video file from the storage of a mobile phone;
S2: inserting dubbing text at different positions of the video;
S3: transmitting the text of step S2 to a text-to-speech server section by section; the server generates the dubbing file and returns it to the position of the original text, so that an audio interval is formed;
S4: inserting a blank audio file into the audio interval of step S3;
S5: synthesizing audio, namely combining the dubbing file of step S3 and the blank audio file of step S4 into a synthesized audio file;
S6: decomposing the original video file into an original audio file and a new video file;
S7: mixing, namely mixing the original audio file and the synthesized audio file to obtain a total audio file;
S8: combining the total audio file with the new video file to obtain a synthesized video file.
2. The method for video dubbing based on TTS technology according to claim 1, characterized in that: in step S1, the method for selecting an original video file further includes shooting a video with the mobile phone camera.
3. The method for video dubbing based on TTS technology according to claim 1, characterized in that: in step S2, the dubbing text is inserted by using the video time as the coordinate, inserting the start and end time points of each text in sequence, and then labeling the corresponding text, in order, in the resulting time array.
4. The method for video dubbing based on TTS technology according to claim 1, characterized in that: in step S2, a user-defined identifier of a time interval duration is inserted into the dubbing text.
5. The method for video dubbing based on TTS technology according to claim 1, characterized in that: in step S3, the speed and pitch of the text-to-speech output are set.
6. The method for video dubbing based on TTS technology according to claim 1, characterized in that: in step S7, the volumes of the original audio file and the synthesized audio file are set before mixing.
7. The method for video dubbing based on TTS technology according to claim 1, characterized in that: step S8 further includes converting the text of step S2 into a subtitle file and merging the subtitle file into the synthesized video.
8. The method for video dubbing based on TTS technology according to claim 7, characterized in that: when the text is converted into a subtitle file, the size, color, and background color of the text are set.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911042390.8A CN110769167A (en) | 2019-10-30 | 2019-10-30 | Method for video dubbing based on text-to-speech technology |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201911042390.8A CN110769167A (en) | 2019-10-30 | 2019-10-30 | Method for video dubbing based on text-to-speech technology |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN110769167A true CN110769167A (en) | 2020-02-07 |
Family
ID=69334553
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN201911042390.8A Pending CN110769167A (en) | 2019-10-30 | 2019-10-30 | Method for video dubbing based on text-to-speech technology |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN110769167A (en) |
Cited By (10)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111491176A (en) * | 2020-04-27 | 2020-08-04 | 百度在线网络技术(北京)有限公司 | Video processing method, device, equipment and storage medium |
| CN111653263A (en) * | 2020-06-12 | 2020-09-11 | 百度在线网络技术(北京)有限公司 | Volume adjusting method and device, electronic equipment and storage medium |
| CN112331223A (en) * | 2020-11-09 | 2021-02-05 | 合肥名阳信息技术有限公司 | Method for adding background music to dubbing |
| CN112397049A (en) * | 2020-11-30 | 2021-02-23 | 长沙神漫文化科技有限公司 | Method for video dubbing based on text-to-speech technology |
| CN112435649A (en) * | 2020-11-09 | 2021-03-02 | 合肥名阳信息技术有限公司 | Multi-user dubbing sound effect mixing method |
| CN112562638A (en) * | 2020-11-26 | 2021-03-26 | 北京达佳互联信息技术有限公司 | Voice preview method and device and electronic equipment |
| CN113411655A (en) * | 2021-05-18 | 2021-09-17 | 北京达佳互联信息技术有限公司 | Method and device for generating video on demand, electronic equipment and storage medium |
| CN114245224A (en) * | 2021-11-19 | 2022-03-25 | 广州坚和网络科技有限公司 | Dubbing video generation method and system based on user input text |
| CN114945075A (en) * | 2022-07-26 | 2022-08-26 | 中广智诚科技(天津)有限公司 | Method and device for synchronizing new dubbing audio contents with video contents |
| CN116229936A (en) * | 2022-12-22 | 2023-06-06 | 中国电建集团河北省电力勘测设计研究院有限公司 | A method for PPT automatic dubbing |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN1774715A (en) * | 2003-04-14 | 2006-05-17 | 皇家飞利浦电子股份有限公司 | System and method for performing automatic dubbing on an audio-visual stream |
| CN101189657A (en) * | 2005-05-31 | 2008-05-28 | 皇家飞利浦电子股份有限公司 | A method and device for performing automatic dubbing on a multimedia signal |
| CN101896923A (en) * | 2007-12-13 | 2010-11-24 | 三星电子株式会社 | Device and method for generating multimedia e-mail |
| US20110060590A1 (en) * | 2009-09-10 | 2011-03-10 | Fujitsu Limited | Synthetic speech text-input device and program |
| US20110093608A1 (en) * | 2003-02-05 | 2011-04-21 | Jason Sumler | System, method, and computer readable medium for creating a video clip |
| US20160021334A1 (en) * | 2013-03-11 | 2016-01-21 | Video Dubber Ltd. | Method, Apparatus and System For Regenerating Voice Intonation In Automatically Dubbed Videos |
| CN109274900A (en) * | 2018-09-05 | 2019-01-25 | 浙江工业大学 | Video dubbing method |
| CN109600566A (en) * | 2018-12-03 | 2019-04-09 | 浙江工业大学 | A kind of video dubbing method |
| CN110149548A (en) * | 2018-09-26 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video dubbing method, electronic device and readable storage medium storing program for executing |
-
2019
- 2019-10-30 CN CN201911042390.8A patent/CN110769167A/en active Pending
Patent Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20110093608A1 (en) * | 2003-02-05 | 2011-04-21 | Jason Sumler | System, method, and computer readable medium for creating a video clip |
| CN1774715A (en) * | 2003-04-14 | 2006-05-17 | 皇家飞利浦电子股份有限公司 | System and method for performing automatic dubbing on an audio-visual stream |
| CN101189657A (en) * | 2005-05-31 | 2008-05-28 | 皇家飞利浦电子股份有限公司 | A method and device for performing automatic dubbing on a multimedia signal |
| CN101896923A (en) * | 2007-12-13 | 2010-11-24 | 三星电子株式会社 | Device and method for generating multimedia e-mail |
| US20110060590A1 (en) * | 2009-09-10 | 2011-03-10 | Fujitsu Limited | Synthetic speech text-input device and program |
| US20160021334A1 (en) * | 2013-03-11 | 2016-01-21 | Video Dubber Ltd. | Method, Apparatus and System For Regenerating Voice Intonation In Automatically Dubbed Videos |
| CN109274900A (en) * | 2018-09-05 | 2019-01-25 | 浙江工业大学 | Video dubbing method |
| CN110149548A (en) * | 2018-09-26 | 2019-08-20 | 腾讯科技(深圳)有限公司 | Video dubbing method, electronic device and readable storage medium storing program for executing |
| CN109600566A (en) * | 2018-12-03 | 2019-04-09 | 浙江工业大学 | A kind of video dubbing method |
Cited By (11)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN111491176A (en) * | 2020-04-27 | 2020-08-04 | 百度在线网络技术(北京)有限公司 | Video processing method, device, equipment and storage medium |
| CN111653263A (en) * | 2020-06-12 | 2020-09-11 | 百度在线网络技术(北京)有限公司 | Volume adjusting method and device, electronic equipment and storage medium |
| CN111653263B (en) * | 2020-06-12 | 2023-03-31 | 百度在线网络技术(北京)有限公司 | Volume adjusting method and device, electronic equipment and storage medium |
| CN112331223A (en) * | 2020-11-09 | 2021-02-05 | 合肥名阳信息技术有限公司 | Method for adding background music to dubbing |
| CN112435649A (en) * | 2020-11-09 | 2021-03-02 | 合肥名阳信息技术有限公司 | Multi-user dubbing sound effect mixing method |
| CN112562638A (en) * | 2020-11-26 | 2021-03-26 | 北京达佳互联信息技术有限公司 | Voice preview method and device and electronic equipment |
| CN112397049A (en) * | 2020-11-30 | 2021-02-23 | 长沙神漫文化科技有限公司 | Method for video dubbing based on text-to-speech technology |
| CN113411655A (en) * | 2021-05-18 | 2021-09-17 | 北京达佳互联信息技术有限公司 | Method and device for generating video on demand, electronic equipment and storage medium |
| CN114245224A (en) * | 2021-11-19 | 2022-03-25 | 广州坚和网络科技有限公司 | Dubbing video generation method and system based on user input text |
| CN114945075A (en) * | 2022-07-26 | 2022-08-26 | 中广智诚科技(天津)有限公司 | Method and device for synchronizing new dubbing audio contents with video contents |
| CN116229936A (en) * | 2022-12-22 | 2023-06-06 | 中国电建集团河北省电力勘测设计研究院有限公司 | A method for PPT automatic dubbing |
Similar Documents
| Publication | Title |
|---|---|
| CN110769167A (en) | Method for video dubbing based on text-to-speech technology |
| CN104732593B (en) | A kind of 3D animation editing methods based on mobile terminal |
| JP4344658B2 (en) | Speech synthesizer |
| US20080275700A1 (en) | Method of and System for Modifying Messages |
| WO2023011221A1 (en) | Blend shape value output method, storage medium and electronic apparatus |
| CN103561217A (en) | Method and terminal for generating captions |
| CN104952471B (en) | A kind of media file synthetic method, device and equipment |
| CN106572395A (en) | Video processing method and device |
| CN107222792A (en) | A kind of caption superposition method and device |
| CN105679348A (en) | Audio and video player and method |
| WO2023045954A1 (en) | Speech synthesis method and apparatus, electronic device, and readable storage medium |
| JP2016091057A (en) | Electronic device |
| CN109274900A (en) | Video dubbing method |
| CN109753259A (en) | A kind of throwing screen system and control method |
| CN117111738A (en) | Man-machine interaction method, device, equipment and storage medium |
| RU2011129330A (en) | METHOD AND DEVICE FOR SPEECH SYNTHESIS |
| CN112562687A (en) | Audio and video processing method and device, recording pen and storage medium |
| CN104519403A (en) | Audio control device and method |
| CN110797003A (en) | Method for displaying caption information by converting text into voice |
| CN114554246B (en) | UGC mode-based medical science popularization video production method and system |
| US11651764B2 (en) | Methods and systems for synthesizing speech audio |
| CN101968894A (en) | Method for automatically realizing sound and lip synchronization through Chinese characters |
| CN105450970A (en) | Information processing method and electronic equipment |
| CN109525787B (en) | Live scene oriented real-time subtitle translation and system implementation method |
| US20240274120A1 (en) | Speech synthesis method and apparatus, electronic device, and readable storage medium |
Legal Events
| Code | Title | Description |
|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20200207 |