Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order, and/or performed in parallel. Moreover, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "include" and variations thereof as used herein are open-ended, i.e., "including but not limited to". The term "based on" is "based, at least in part, on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Relevant definitions for other terms will be given in the following description.
It should be noted that the terms "first", "second", and the like in the present disclosure are only used for distinguishing different devices, modules or units, and are not used for limiting the order or interdependence relationship of the functions performed by the devices, modules or units.
It is noted that references to "a", "an", and "the" modifications in this disclosure are intended to be illustrative rather than limiting, and that those skilled in the art will recognize that "one or more" may be used unless the context clearly dictates otherwise.
The names of messages or information exchanged between devices in the embodiments of the present disclosure are for illustrative purposes only, and are not intended to limit the scope of the messages or information.
In the related art, the proofreading interface of the main control device for simultaneous interpretation is exposed through an external webpage link, so that an interpreter can modify both the original text and the translated text for simultaneous interpretation through that link. However, the modification authority over the subtitles granted through the external webpage link is essentially the same as the translator authority at the main control end (both may modify the original text and the translated text), which amounts to allowing two persons to freely edit the original text and the translated text at the same time. The results are therefore easily confused, and the complexity and risk of accurately determining the subtitles are greatly increased.
Fig. 1 is a flowchart of a method for determining simultaneous interpretation subtitles according to an embodiment of the present disclosure. The embodiment is applicable to the case of determining simultaneous interpretation subtitles in a live audio and video stream. The method may be performed by a device for determining simultaneous interpretation subtitles, where the device may be implemented in hardware and/or software and may generally be integrated in an electronic device having the function of determining simultaneous interpretation subtitles, such as a server, a mobile terminal, or a server cluster. As shown in Fig. 1, the method specifically includes the following steps:
Step 110: acquiring a source subtitle matched with the live audio and video stream in real time.
Specifically, the main control end collects the live audio and video stream in real time and determines a source subtitle matched with the live audio and video stream. The main control end is an electronic device that can collect the live audio and video stream in real time, determine a source subtitle matched with the live audio and video stream, and carry out simultaneous interpretation. The live audio and video stream may include audio and video data streams collected in various live scenarios such as conferences, media events, broadcasts, and lectures. Illustratively, the live audio and video stream acquired in real time is an audio and video stream collected during a live speech, and includes not only the audio data produced by the speaker during the speech but also the video data of the speaker during the speech. The live audio and video stream can also be understood as the live audio and video data to be simultaneously interpreted.
In the embodiment of the present disclosure, the source subtitle matched with the live audio and video stream is text data in the same language as the voice data in the live audio and video stream. Illustratively, acquiring the source subtitle matched with the live audio and video stream in real time includes: collecting the live audio and video stream in real time and extracting an audio stream from it; and performing speech recognition on the audio stream to determine the source subtitle matched with the live audio and video stream. Specifically, the audio stream can be extracted directly from the whole live audio and video stream, or extracted from the audio and video stream concurrently in a multi-threaded manner. Illustratively, an audio stream is extracted from the live audio and video stream, the audio stream is recognized based on Automatic Speech Recognition (ASR) technology, and the recognition result is used as the source subtitle matched with the live audio and video stream.
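The extract-and-recognize flow described above can be sketched as follows. This is a minimal illustration in Python; `extract_audio` and `recognize_speech` are hypothetical stand-ins for the real audio demuxing and ASR engine, which the disclosure does not specify.

```python
def extract_audio(av_chunk: dict) -> bytes:
    """Pull the audio track out of one chunk of the live audio/video stream."""
    return av_chunk["audio"]

def recognize_speech(audio: bytes) -> str:
    """Stand-in for an ASR engine; a real implementation would call an
    ASR service on the raw audio rather than decode it as text."""
    return audio.decode("utf-8")

def source_subtitles(av_stream):
    """Yield a source subtitle for each chunk of the live A/V stream."""
    for chunk in av_stream:
        yield recognize_speech(extract_audio(chunk))

stream = [{"audio": b"hello everyone", "video": b"..."},
          {"audio": b"welcome to the conference", "video": b"..."}]
print(list(source_subtitles(stream)))
```

In practice the extraction and recognition would run continuously and concurrently, as the multi-threaded variant in the text suggests.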
Step 120: sending the source subtitle to at least one translator end, where different translator ends are used to translate the source subtitle into target translation subtitles in different languages.
The translator end is an electronic device for translating the source subtitle, and different translator ends are used to translate the source subtitle into target translation subtitles in different languages. Each translator end shares the same conference ID with the main control end, which ensures that the translator ends and the main control end determine the simultaneous interpretation subtitles of the same live audio and video data stream. The main control end sends the source subtitle matched with the live audio and video stream acquired in real time to at least one translator end, and each translator end converts the source subtitle into a target translation subtitle in the language assigned to that translator end. Specifically, each translator end can use a translation tool adapted to it to convert the source subtitle into the target translation subtitle, and then send the target translation subtitle to the main control end. For example, the source subtitle is subtitle data in a first language and the target translation subtitle is subtitle data in a second language, where each language may be Chinese, English, French, German, Korean, etc., and the first language is different from the second language.
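The fan-out from the main control end to translator ends sharing a conference ID might look like the following sketch. The class and function names (`TranslatorEnd`, `dispatch`) and the tagging-style `translate` are illustrative assumptions, not part of the disclosure.

```python
class TranslatorEnd:
    """One translator end, registered for a conference and a target language."""
    def __init__(self, conference_id: str, language: str):
        self.conference_id = conference_id
        self.language = language

    def translate(self, source: str) -> str:
        # A real end would run machine translation plus human post-editing;
        # here we merely tag the text with the target language.
        return f"[{self.language}] {source}"

def dispatch(conference_id, source_subtitle, translator_ends):
    """Send the source subtitle to every end in the same conference."""
    return {
        end.language: end.translate(source_subtitle)
        for end in translator_ends
        if end.conference_id == conference_id
    }

ends = [TranslatorEnd("conf-1", "fr"), TranslatorEnd("conf-1", "de"),
        TranslatorEnd("conf-2", "ko")]
result = dispatch("conf-1", "hello", ends)
```

Filtering on the conference ID models the requirement that all ends work on the same live audio and video data stream.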
Step 130: receiving the target translation subtitles fed back by the translator ends for the source subtitle, and adding the source subtitle and the target translation subtitles, as simultaneous interpretation subtitles, into the live audio and video stream.
In the embodiment of the present disclosure, the main control end receives the target translation subtitles fed back by each translator end for the source subtitle, that is, receives the target translation subtitles corresponding to the source subtitle sent by each translator end. The source subtitle and the corresponding target translation subtitles are taken as the simultaneous interpretation subtitles matched with the live audio and video stream and are superimposed onto the live audio and video stream. Optionally, before superimposing the simultaneous interpretation subtitles onto the live audio and video stream, the method further includes: acquiring timestamps of the simultaneous interpretation subtitles. Adding the simultaneous interpretation subtitles into the live audio and video stream then includes: adding the simultaneous interpretation subtitles into the live audio and video stream according to the timestamps, so as to synchronize the simultaneous interpretation subtitles with the live audio and video stream. Specifically, the timestamp of a simultaneous interpretation subtitle is the start-stop time of that subtitle data in the live audio and video stream. The timestamp of the source subtitle and the timestamp of the target translation subtitle are obtained respectively, and the source subtitle and the target translation subtitle are correspondingly and synchronously added into the live audio and video stream based on those timestamps.
Specifically, the audio stream in the live audio and video stream is recognized to determine the corresponding source subtitle and its timestamp. The main control end sends the source subtitle and its timestamp to each translator end, and each translator end determines the timestamp of the target translation subtitle based on the timestamp of the source subtitle while translating the source subtitle into the target translation subtitle. The simultaneous interpretation subtitle is then added to the corresponding position in the live audio and video stream according to the timestamp of the source subtitle, the timestamp of the target translation subtitle, and the timestamp of the live audio and video stream, so that the display time of the simultaneous interpretation subtitle is aligned or synchronized with the playing time of the live audio and video stream.
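A minimal sketch of the timestamp-based alignment described above; the chunk and subtitle field names (`start`, `end`, `text`, `subs`) are assumptions for illustration, and each subtitle is attached to the stream chunk whose time range contains its start time.

```python
def align_subtitles(chunks, subtitles):
    """Attach each subtitle to the chunk covering its start timestamp."""
    for sub in subtitles:
        for chunk in chunks:
            if chunk["start"] <= sub["start"] < chunk["end"]:
                chunk.setdefault("subs", []).append(sub)
                break
    return chunks

chunks = [{"start": 0.0, "end": 5.0}, {"start": 5.0, "end": 10.0}]
subs = [{"start": 1.2, "end": 4.8, "text": "hello"},
        {"start": 6.0, "end": 9.0, "text": "world"}]
aligned = align_subtitles(chunks, subs)
```

In a real pipeline both the source subtitle and its target translation subtitle would carry the same start-stop times, so aligning one aligns the pair.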
Optionally, superimposing the simultaneous interpretation subtitles onto the live audio and video stream includes: superimposing the target translation subtitles and the source subtitles onto the live audio and video stream in a vertically corresponding arrangement. The advantage of this arrangement is that the subtitle data in the audience's language (the target translation subtitle) is displayed above the subtitle data in the speaker's language (the source subtitle) in a corresponding manner, which highlights the key content, is simple and clear, and can greatly improve the user's reading experience.
Illustratively, the language of the source subtitle is the language of the voice information in the live audio and video stream; that is, the source subtitle data is the text corresponding to the speaker's language in the live audio and video stream, while the target translation subtitle is the text corresponding to the language used by the user watching the simultaneously interpreted live audio and video stream. Superimposing the target translation subtitles and the source subtitles onto the live audio and video stream in a vertically corresponding relationship can thus be understood as follows: when the live audio and video stream carrying the simultaneous interpretation subtitle data is played on the live broadcast picture, each target translation subtitle is displayed above its source subtitle, and the target translation subtitles correspond one-to-one to the source subtitles.
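The "target above source" layout can be illustrated with a trivial rendering helper; this is only a sketch, since the real display is drawn onto the live broadcast picture rather than composed as text.

```python
def render_pair(target_line: str, source_line: str) -> str:
    """Return the two-line on-screen form: target translation on top,
    source subtitle underneath, matching the vertical arrangement."""
    return f"{target_line}\n{source_line}"

print(render_pair("Bonjour à tous", "Hello everyone"))
```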
According to the embodiment of the present disclosure, a source subtitle matched with the live audio and video stream is acquired in real time; the source subtitle is sent to at least one translator end, where different translator ends are used to translate the source subtitle into target translation subtitles in different languages; and the target translation subtitles fed back by the translator ends for the source subtitle are received, and the source subtitle and the target translation subtitles are added, as simultaneous interpretation subtitles, into the live audio and video stream. According to this method for determining simultaneous interpretation subtitles, the main control end obtains the source subtitle data matched with the live audio and video stream in real time, and the target translation subtitles in different languages are obtained by at least one translator end arranged independently of the main control end, so that the simultaneous interpretation subtitles can be obtained accurately, quickly, and in an orderly manner, translation subtitles in multiple languages can be obtained at the same time, and the real-time performance of simultaneous interpretation is effectively guaranteed.
In some embodiments, before sending the source subtitle to at least one interpreter, the method further includes: and locally displaying the source caption, and responding to the received first correction information matched with the displayed source caption to correct the source caption. The method has the advantages that the accuracy of the source subtitles can be effectively guaranteed, and therefore the accuracy of the target translated subtitles converted from the source subtitles by the translator end is further guaranteed.
For example, when the speaker's pronunciation is non-standard or the voice data contains technical terms, the source subtitle obtained by performing speech recognition on the live audio and video stream may be inaccurate, and the translator end may consequently generate an inaccurate target translation subtitle after converting the source subtitle. Therefore, the source subtitle matched with the live audio and video stream is acquired in real time and displayed locally; when the interpreter at the main control end considers the source subtitle inaccurate, correction information can be entered for it, and the main control end corrects the source subtitle in response to the received first correction information matched with the displayed source subtitle. The main control end then sends the corrected source subtitle to the at least one translator end, so that the translator end generates the target translation subtitle according to the corrected source subtitle.
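The correction step applied before forwarding can be sketched as follows; the function name `apply_correction` is an illustrative assumption.

```python
def apply_correction(source_subtitle, correction=None):
    """Return the corrected subtitle, or the original when no first
    correction information was received for it."""
    return correction if correction is not None else source_subtitle

# ASR mis-recognized a technical term; the main-control-end interpreter
# enters the corrected text before it is sent to the translator ends.
recognized = "the model uses back propagation"
corrected = apply_correction(recognized, "the model uses backpropagation")
```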
In some embodiments, after acquiring the source subtitle matched with the live audio and video stream in real time, the method further includes: segmenting the source subtitles according to a preset mode to generate at least one sub-source subtitle; locally displaying the source caption, and responding to received first correction information matched with the displayed source caption to correct the source caption, wherein the method comprises the following steps: and locally displaying the sub-source subtitles item by item, and responding to the received first correction information matched with at least one displayed target sub-source subtitle to correct the target sub-source subtitle. The method has the advantages that the translator of the main control end can modify the source subtitles one by one, and the efficiency of modifying the source subtitles is improved.
Specifically, the source subtitle corresponding to the acquired live audio and video stream may be very long; faced with a lengthy source subtitle, the interpreter at the main control end can neither quickly judge whether it is accurate nor quickly and accurately correct it. Therefore, the source subtitle can be segmented in a preset manner to generate at least one sub-source subtitle, so that each sub-source subtitle is displayed as a sentence of moderate length. Optionally, segmenting the source subtitle in a preset manner includes: segmenting the source subtitle based on a knowledge graph; or segmenting the source subtitle based on a preset number of characters, so that each sub-source subtitle contains the preset number of characters. Segmenting the source subtitle based on a knowledge graph means keeping semantically related content within the same sub-source subtitle as far as possible, so that each sub-source subtitle generated by segmentation has a proper length. Segmenting the source subtitle based on a preset number of characters means making each segmented sub-source subtitle contain the preset number of characters, so that the sub-source subtitles obtained by segmentation have a fixed and moderate length. Optionally, segmenting the source subtitle in a preset manner includes: in the process of determining the source subtitle corresponding to the live audio and video stream, segmenting the source subtitle based on Voice Activity Detection (VAD).
Specifically, in the process of determining the source subtitle corresponding to the live audio and video stream based on its audio stream, the source subtitle is segmented in a VAD manner: the audio data in the live audio and video stream is segmented according to the speaking pauses, that is, the positions at which sentences should be broken are determined from the waveform of the audio during speech recognition, so that the source subtitle corresponding to the whole live audio and video stream is segmented into at least one sub-source subtitle.
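Two of the preset segmentation modes can be sketched as follows. The character-count split mirrors the description directly; the pause-based split is only a toy stand-in for real VAD, which operates on the audio waveform rather than on (word, pause) tuples as assumed here.

```python
def split_by_chars(text, n):
    """Cut the source subtitle into sub-source subtitles of n characters."""
    return [text[i:i + n] for i in range(0, len(text), n)]

def split_by_pauses(words, min_pause=0.5):
    """Start a new sub-source subtitle whenever the pause after a word
    exceeds min_pause seconds (a crude, illustrative VAD analogue)."""
    segments, current = [], []
    for word, pause in words:
        current.append(word)
        if pause >= min_pause:
            segments.append(" ".join(current))
            current = []
    if current:
        segments.append(" ".join(current))
    return segments

print(split_by_chars("abcdefgh", 3))  # ['abc', 'def', 'gh']
print(split_by_pauses([("hello", 0.1), ("everyone", 0.8),
                       ("welcome", 0.2), ("back", 0.0)]))
```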
In the embodiment of the present disclosure, the sub-source subtitles are displayed locally item by item; that is, the sentence-form sub-source subtitles of moderate length are displayed in the display interface one at a time, and the interpreter at the main control end can correct each sub-source subtitle item by item. The interpreter at the main control end can correct one or more of the sub-source subtitles displayed item by item; specifically, the main control end corrects the target sub-source subtitle in response to the received first correction information matched with at least one displayed target sub-source subtitle, where the target sub-source subtitle is the sub-source subtitle to be corrected.
Optionally, locally displaying the sub-source subtitles item by item, including: displaying a current sub-source caption in a display page, and providing a switching option for a previous caption and a next caption of the current sub-source caption in the display page; and responding to the received selection of the switching option, acquiring the switching sub-source caption to be displayed in a display page, and moving an editing cursor to a set position in the switching sub-source caption. The advantage of this arrangement is that the sub-source caption currently displayed on the display page can be quickly switched according to the switching option, thereby facilitating quick correction of the sub-source caption.
Illustratively, Fig. 2 is a schematic diagram of a display page showing a sub-source subtitle in an embodiment of the present disclosure. As shown in Fig. 2, the current sub-source subtitle is displayed in the display page, together with a switching option (↑) for the previous subtitle and a switching option (↓) for the next subtitle. The interpreter at the main control end can quickly switch the currently displayed sub-source subtitle to the previous sub-source subtitle by clicking the switching option for the previous subtitle, or to the next sub-source subtitle by clicking the switching option for the next subtitle. In response to receiving the interpreter's selection of the switching option for the previous or next subtitle, the switched sub-source subtitle (that is, the sub-source subtitle immediately before or after the currently displayed one) is acquired and displayed in the display interface, so that the interpreter at the main control end can conveniently judge whether the switched sub-source subtitle is accurate. Specifically, when the switched sub-source subtitle is displayed in the display page, the editing cursor is moved to a set position in the switched sub-source subtitle, where the set position may be the start position, the end position, or any middle position of the switched sub-source subtitle. The advantage of this arrangement is that the interpreter at the main control end can immediately enter correction information for the switched sub-source subtitle, realizing quick correction of the source subtitle.
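The switching behavior, including resetting the editing cursor to a set position, can be modeled with a small navigator class. This is an illustrative sketch; here the set position is assumed to be the start of the switched subtitle.

```python
class SubtitleNavigator:
    """Models the display page: one current sub-source subtitle, with
    previous/next switching and an editing-cursor position."""
    def __init__(self, subtitles):
        self.subtitles = subtitles
        self.index = 0
        self.cursor = 0  # editing-cursor position within the current subtitle

    def current(self):
        return self.subtitles[self.index]

    def next(self):
        if self.index < len(self.subtitles) - 1:
            self.index += 1
        self.cursor = 0  # set position: start of the switched subtitle
        return self.current()

    def previous(self):
        if self.index > 0:
            self.index -= 1
        self.cursor = 0
        return self.current()

nav = SubtitleNavigator(["first line", "second line", "third line"])
```

The Tab / Shift+Tab shortcuts mentioned below would simply invoke `next()` and `previous()` respectively.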
Optionally, different switching options correspond to different shortcut keys. The method has the advantages that the translator of the main control end can realize the quick switching of the currently displayed sub-source subtitles in a mode of operating a shortcut key, and therefore the method is beneficial to improving the correction speed of the sub-source subtitles. For example, the shortcut key corresponding to the switching option of the previous subtitle of the current sub-source subtitle may be a Shift key + Tab key, and the shortcut key corresponding to the switching option of the next subtitle of the current sub-source subtitle may be a Tab key. Specifically, when the main control end detects that the Shift key and the Tab key are triggered, the currently displayed sub-source caption of the display page can be quickly switched to the previous sub-source caption, that is, the previous sub-source caption of the currently displayed sub-source caption is displayed in the display page; when the main control end detects that the Tab key is triggered, the sub-source caption currently displayed on the display page can be quickly switched to the next sub-source caption, namely, the next sub-source caption of the currently displayed sub-source caption is displayed on the display page.
In some embodiments, correcting the target sub-source subtitle in response to receiving the first correction information matched with the at least one displayed target sub-source subtitle includes: in response to receiving a select-all shortcut operation from the user, selecting the entire displayed current sub-source subtitle as the target sub-source subtitle; and in response to receiving the first correction information matched with the target sub-source subtitle, correcting the target sub-source subtitle as a whole. The advantage of this arrangement is that the interpreter at the main control end can conveniently and directly rewrite a sub-source subtitle comprehensively and quickly. Illustratively, the shortcut key for the select-all operation may be the Backspace key; when it is detected that the shortcut key is triggered, the current sub-source subtitle displayed in the display page is fully selected, which facilitates the user (the interpreter at the main control end) in directly entering the correct source subtitle, and is applicable to scenarios in which the determined sub-source subtitle differs greatly from the actual voice content. The first correction information corresponding to the fully selected target sub-source subtitle (that is, the correct sub-source subtitle re-entered by the user) is received, and the target sub-source subtitle is corrected as a whole.
In some embodiments, after generating the at least one sub-source subtitle, the method further includes: determining the number of characters contained in each sub-source subtitle. When the sub-source subtitles are displayed locally item by item, the method further includes: locally displaying the number of characters of each sub-source subtitle, and reminding the user of an abnormal sub-source subtitle when its number of characters exceeds a preset first number threshold. The advantage of this arrangement is that the interpreter at the main control end can clearly know whether the length of each sub-source subtitle generated by segmentation is proper.
Illustratively, after the source subtitle is segmented, the number of characters contained in each sub-source subtitle is obtained; the sub-source subtitles are displayed locally item by item, and the number of characters of each sub-source subtitle is displayed at the same time, so that the interpreter at the main control end can clearly know the length of each sub-source subtitle. When the number of characters of a sub-source subtitle exceeds the preset first number threshold, the sub-source subtitle is too long, which makes it inconvenient for the interpreter at the main control end to correct it quickly; moreover, when the live broadcast picture is played, an overly long subtitle is neither attractive to display nor convenient for the user to read. Therefore, a sub-source subtitle whose number of characters exceeds the preset first number threshold can be determined as an abnormal sub-source subtitle, and a reminder can be issued for it, such as 'this sub-source subtitle is long; please increase the correction speed'.
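The character-count check and reminder can be sketched as follows; the threshold value and the message wording are illustrative.

```python
def flag_long_subtitles(sub_subtitles, threshold):
    """Return (text, char_count, is_abnormal) for each sub-source subtitle,
    where is_abnormal marks counts above the preset first number threshold."""
    return [(s, len(s), len(s) > threshold) for s in sub_subtitles]

report = flag_long_subtitles(["short", "a rather long sub-source subtitle"],
                             threshold=20)
for text, count, abnormal in report:
    if abnormal:
        print(f"subtitle is long ({count} chars); please increase the correction speed")
```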
In some embodiments, sending the source subtitle to at least one translator end includes: sending each sub-source subtitle to each translator end item by item, so that each translator end generates, item by item, a sub-target translation subtitle matched with each sub-source subtitle. Since the source subtitle is segmented at the main control end into at least one sub-source subtitle, the main control end can send each sub-source subtitle to the translator ends one by one, and each translator end obtains the sub-target translation subtitles matched with the sub-source subtitles one by one. This effectively avoids the interference with translation accuracy that the surrounding context easily causes when a long source subtitle is converted into a target translation subtitle, and improves the speed and accuracy of obtaining the target translation subtitles.
In some embodiments, after the source subtitle and the target translation subtitles are added, as simultaneous interpretation subtitles, into the live audio and video stream, the method further includes: playing a target audio and video stream, where the target audio and video stream is the live audio and video stream carrying the simultaneous interpretation subtitles. In the embodiment of the present disclosure, the target audio and video stream is played in an audio and video playing interface. It can be understood that, when watching the target audio and video stream played in the audio and video playing interface, the user can not only see the video picture in the live audio and video stream and hear the voice information in it, but also see the subtitle data displayed synchronously with the voice information in the live audio and video stream.
In some embodiments, during playing of the target audio and video stream, the method further includes: acquiring the currently played sub-source subtitle from the target audio and video stream in real time, and acquiring the playing number of the currently played sub-source subtitle among all the sub-source subtitles; locally displaying the playing number, and sending the playing number to the at least one translator end. After generating the at least one sub-source subtitle, the method further includes: determining the target number of each sub-source subtitle within the whole source subtitle. When the sub-source subtitles are displayed locally item by item, the method further includes: locally displaying the target number of each sub-source subtitle, and sending the target numbers of the sub-source subtitles to the at least one translator end. The advantage of this arrangement is that the interpreter at the main control end can adjust the correction progress of the sub-source subtitles in time according to their playing progress, which not only effectively avoids invalid modifications of the sub-source subtitles but also ensures their accuracy as far as possible. In addition, the translator at the translator end can adjust the correction progress of the sub-target translation subtitles in time according to the playing progress of the corresponding sub-source subtitles, which likewise effectively avoids invalid modifications of the sub-target translation subtitles and ensures their accuracy as far as possible.
In the embodiment of the present disclosure, when the target audio and video stream is played, the source subtitle in it is played item by item in the form of sub-source subtitles. Therefore, during playing of the target audio and video stream, the currently played sub-source subtitle can be acquired from the target audio and video stream in real time, and its playing number among all the sub-source subtitles can be determined. For example, if the source subtitle matched with the live audio and video stream is segmented into 10 sub-source subtitles and the 5th sub-source subtitle is currently being played in the target audio and video stream, the playing number of the currently played sub-source subtitle is 5. Illustratively, the playing number is displayed locally and sent to the translator ends. Correspondingly, the target number of each sub-source subtitle within the whole source subtitle is determined, displayed locally, and sent to the at least one translator end. The interpreter at the main control end can thus clearly know which sub-source subtitle is currently being played, and adjust the correction progress of the sub-source subtitles according to the playing number and the target number of each sub-source subtitle.
For example, if the playing number of the currently played sub-source subtitle is 5 while the interpreter at the main control end is correcting the sub-source subtitle with target number 4, the sub-source subtitle numbered 4 has obviously already been played, so correcting it is an invalid modification. The interpreter can then adjust the correction progress of the sub-source subtitles, for example by skipping the sub-source subtitle numbered 4 and instead correcting the sub-source subtitle numbered 7. In addition, the translator at a translator end can also clearly know the currently played sub-source subtitle; since the sub-target translation subtitles correspond one-to-one to the sub-source subtitles, the translator at the translator end can likewise adjust the correction progress of the sub-target translation subtitles according to the playing number and the target number of each sub-source subtitle.
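The skip-ahead logic in this example can be sketched as follows; the helper name `next_valid_target` is an illustrative assumption.

```python
def next_valid_target(playing_number, pending_targets):
    """Return the first pending target number still worth correcting:
    anything at or before the current playing number has already been
    played, so correcting it would be an invalid modification."""
    for target in sorted(pending_targets):
        if target > playing_number:
            return target
    return None  # everything pending has already been played

# Playing number 5: correcting number 4 would be invalid, so jump to 7.
print(next_valid_target(5, [4, 7, 9]))  # prints 7
```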
In some embodiments, acquiring the source subtitle matched with the live audio and video stream in real time includes: acquiring the live audio and video stream in real time and buffering it based on a preset delay time; and acquiring the source subtitle matched with the live audio and video stream within the preset delay time. Playing the target audio and video stream then includes: playing the target audio and video stream when the preset delay time is reached. The advantage of this arrangement is that the live audio and video stream acquired in real time is buffered for the preset delay time, the simultaneous interpretation subtitles corresponding to the live audio and video stream are accurately determined within that time, and the live audio and video stream carrying the simultaneous interpretation subtitles is played when the delay expires. This effectively guarantees the accuracy of subtitle determination and the stability of subtitle display in the live audio and video stream, achieves synchronization of live sound and picture, and greatly improves the user experience.
The preset delay time may be set in advance according to actual requirements and may be, for example, 1 min. Of course, the user can adjust the preset delay time at any time according to the specific situation to achieve the desired effect. For example, the preset delay time may be set according to the live scene of the live audio/video stream: different live scenes may correspond to different preset delay times, and the higher the real-time requirement on the live broadcast, the shorter the corresponding preset delay time; conversely, the lower the real-time requirement, the longer the delay. For example, if the live audio/video stream is captured by live broadcasting a major football match, the preset delay time may be set to 10 s; if it is captured by live broadcasting an academic conference, the preset delay time may be set to 5 min. Within the preset delay, human proofreaders have enough time to comprehensively proofread and correct the simultaneous interpretation captions, which fully guarantees the readability and accuracy of the finally displayed captions, reaching the level of "manual captions". When the preset delay time ends, the cached target audio/video stream is played, and because the subtitle data corresponding to the live stream is by then stable, this effectively solves the technical problems of the prior art, in which subtitles are displayed in a typewriter fashion and jump unstably across the live picture, making it hard for users to focus their vision, easily causing visual fatigue, leaving subtitle content on screen only briefly, and giving audiences a poor reading experience.
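The cache-then-play behavior might be sketched as a simple delay buffer. The class name and queue-based structure are assumptions made for illustration; the disclosure only specifies that the stream is cached for the preset delay and played when the delay elapses.

```python
import time
from collections import deque

# Illustrative sketch: stream chunks are held for a preset delay, giving
# proofreaders the window in which to finalize the matching subtitles.
class DelayBuffer:
    def __init__(self, delay_seconds: float):
        self.delay = delay_seconds
        self.queue = deque()  # (arrival_time, chunk), in arrival order

    def push(self, chunk, now=None):
        self.queue.append((time.monotonic() if now is None else now, chunk))

    def pop_ready(self, now=None):
        """Return chunks whose preset delay has fully elapsed."""
        now = time.monotonic() if now is None else now
        ready = []
        while self.queue and now - self.queue[0][0] >= self.delay:
            ready.append(self.queue.popleft()[1])
        return ready

buf = DelayBuffer(delay_seconds=10.0)   # e.g. 10 s for a football match
buf.push("chunk-1", now=0.0)
assert buf.pop_ready(now=5.0) == []     # still inside the proofreading window
assert buf.pop_ready(now=10.0) == ["chunk-1"]
```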
Fig. 3 is a flowchart of a method for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. As shown in fig. 3, the method includes the following steps:
Step 310, acquiring a source subtitle matched with the live audio/video stream in real time.
Step 320, segmenting the source subtitle according to a preset mode to generate at least one sub-source caption.
Step 330, determining the number of characters contained in each sub-source caption.
Step 340, locally displaying, item by item, each sub-source caption and its number of characters.
Optionally, locally displaying the sub-source captions item by item includes: displaying a current sub-source caption in a display page, and providing, in the display page, switching options for the previous caption and the next caption of the current sub-source caption; and in response to a received selection of a switching option, acquiring the switched-to sub-source caption for display in the display page and moving an editing cursor to a set position in that caption.
Optionally, different switching options correspond to different shortcut keys.
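The prev/next switching with a cursor moved to a set position might be sketched as below. The class name is an assumption, as is placing the cursor at the end of the switched-to caption (the disclosure allows any set position).

```python
# Hedged sketch of item-by-item presentation: one current caption with
# prev/next switching options, and an editing cursor that follows the switch.
class CaptionBrowser:
    def __init__(self, captions):
        self.captions = captions
        self.index = 0
        self.cursor = len(captions[0]) if captions else 0

    def current(self):
        return self.captions[self.index]

    def switch(self, option):
        """option is 'prev' or 'next'; moves the cursor into the new caption."""
        if option == "next" and self.index < len(self.captions) - 1:
            self.index += 1
        elif option == "prev" and self.index > 0:
            self.index -= 1
        self.cursor = len(self.current())  # assumed set position: caption end
        return self.current()

b = CaptionBrowser(["first caption", "second caption"])
assert b.switch("next") == "second caption"
assert b.cursor == len("second caption")
```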
For example, fig. 4 is a schematic interface diagram illustrating the sub-source captions and their numbers of characters on the presentation page in an embodiment of the disclosure.
Step 350, when the number of characters of a sub-source caption exceeds a preset first number threshold, issuing a user reminder for that abnormal sub-source caption.
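Steps 320, 330, and 350 can be sketched together as follows. Splitting on sentence-ending punctuation is only an assumption (the disclosure says merely "a preset mode"), and the threshold value of 20 is an illustrative placeholder.

```python
import re

# Hedged sketch: segment the source subtitle into sub-source captions,
# count each caption's characters, and flag captions over the threshold.
def segment_source_subtitle(source):
    # Assumed "preset mode": split after sentence-ending punctuation.
    parts = re.split(r"(?<=[.!?\u3002\uff01\uff1f])\s*", source)
    return [p for p in parts if p]

def with_char_counts(captions):
    return [(c, len(c)) for c in captions]

FIRST_COUNT_THRESHOLD = 20  # assumed value for illustration

subs = segment_source_subtitle("Welcome everyone. The match begins now!")
assert subs == ["Welcome everyone.", "The match begins now!"]
abnormal = [c for c, n in with_char_counts(subs) if n > FIRST_COUNT_THRESHOLD]
assert abnormal == ["The match begins now!"]  # 21 characters, gets a reminder
```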
Step 360, in response to received first correction information matched with at least one displayed target sub-source caption, correcting the target sub-source caption.
Optionally, correcting the target sub-source caption in response to the received first correction information includes: in response to a received shortcut-key select-all operation by the user, selecting the entire displayed current sub-source caption as the target sub-source caption; and in response to the received first correction information matched with the target sub-source caption, correcting the target sub-source caption as a whole.
Step 370, sending the sub-source captions one by one to each translator end, so that each translator end generates sub-target translation captions matched one to one with the sub-source captions.
Step 380, receiving the sub-target translation captions fed back by each translator end for each sub-source caption, and adding each sub-source caption and each sub-target translation caption to the live audio/video stream as simultaneous interpretation subtitles.
The method for determining simultaneous interpretation subtitles provided by this embodiment of the disclosure enables the interpreter at the master control end to modify the source subtitle item by item, improving modification efficiency, while sending the sub-source captions item by item to the translator ends helps each translator end accurately and quickly determine the sub-target translation captions matched with the sub-source captions. In addition, the number of characters of each sub-source caption is displayed on the presentation page of the master control end, so the interpreter at the master control end can clearly tell whether the length of each segmented sub-source caption is appropriate.
Fig. 5 is a flowchart of a method for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. As shown in fig. 5, the method includes the following steps:
Step 510, acquiring a source subtitle matched with the live audio/video stream in real time.
Step 520, segmenting the source subtitle according to a preset mode to generate at least one sub-source caption.
Step 530, determining the target number of each sub-source caption within the whole source subtitle.
Step 540, determining the number of characters contained in each sub-source caption.
Step 550, locally displaying, item by item, each sub-source caption, its number of characters, and its target number within the whole source subtitle.
Optionally, locally displaying the sub-source captions item by item includes: displaying a current sub-source caption in a display page, and providing, in the display page, switching options for the previous caption and the next caption of the current sub-source caption; and in response to a received selection of a switching option, acquiring the switched-to sub-source caption for display in the display page and moving an editing cursor to a set position in that caption.
Optionally, different switching options correspond to different shortcut keys.
Step 560, when the number of characters of a sub-source caption exceeds a preset first number threshold, issuing a user reminder for that abnormal sub-source caption.
Step 570, in response to the received first correction information matching the displayed at least one target sub-source subtitle, correcting the target sub-source subtitle.
Optionally, correcting the target sub-source caption in response to the received first correction information includes: in response to a received shortcut-key select-all operation by the user, selecting the entire displayed current sub-source caption as the target sub-source caption; and in response to the received first correction information matched with the target sub-source caption, correcting the target sub-source caption as a whole.
Step 580, sending each sub-source caption and its target number one by one to each translator end, so that each translator end generates sub-target translation captions matched one to one with the sub-source captions.
Step 590, receiving the sub-target translation captions fed back by each translator end for each sub-source caption, and adding each sub-source caption and each sub-target translation caption to the live audio/video stream as simultaneous interpretation subtitles.
Step 5100, playing the target audio/video stream, where the target audio/video stream is the live audio/video stream to which the simultaneous interpretation subtitles have been added.
Step 5110, in the process of playing the target audio/video stream, acquiring the currently played sub-source caption from the target audio/video stream in real time, and acquiring its playing number among all the played sub-source captions.
Step 5120, locally displaying the playing number and sending the playing number to at least one translator end.
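Steps 5110 and 5120 might be sketched as a small tracker at the master control end. The class names and the callback-style transport are assumptions; the disclosure only requires that the playing number be displayed locally and sent to the translator ends.

```python
# Hedged sketch: during playback the master control end records the playing
# number of the currently played sub-source caption and broadcasts it.
class PlaybackTracker:
    def __init__(self, translator_ends):
        self.translator_ends = translator_ends  # objects with .receive(number)
        self.playing_number = 0                 # shown on the local page

    def on_caption_played(self, number):
        self.playing_number = number
        for end in self.translator_ends:        # send to every translator end
            end.receive(number)

class FakeTranslatorEnd:
    def __init__(self):
        self.numbers = []
    def receive(self, number):
        self.numbers.append(number)

end = FakeTranslatorEnd()
tracker = PlaybackTracker([end])
tracker.on_caption_played(5)
assert tracker.playing_number == 5 and end.numbers == [5]
```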
Illustratively, fig. 6 is a schematic interface diagram in which the display page shows each sub-source caption, its number of characters, its target number within the whole source subtitle, and the playing number of the currently played sub-source caption among all the played sub-source captions, in an embodiment of the present disclosure.
The method for determining simultaneous interpretation subtitles provided by this embodiment of the disclosure enables the interpreter at the master control end to modify the source subtitle item by item, improving modification efficiency, while sending the sub-source captions item by item to the translator ends helps each translator end accurately and quickly determine the sub-target translation captions matched with the sub-source captions. In addition, the number of characters of each sub-source caption, its target number, and the playing number of the currently played sub-source caption are displayed on the presentation page of the master control end. The interpreter at the master control end can therefore clearly tell whether the length of each segmented sub-source caption is appropriate and can adjust the modification progress of the sub-source captions in time according to their playing progress, while the translator at each translator end can likewise adjust the modification progress of the sub-target translation captions according to the playing progress of the sub-source captions.
Fig. 7 is a flowchart of a method for determining simultaneous interpretation subtitles according to an embodiment of the present disclosure. This embodiment is applicable to determining simultaneous interpretation subtitles in a live audio/video stream. The method may be performed by a device for determining simultaneous interpretation subtitles, which may be composed of hardware and/or software and is generally integrated in equipment having the function of determining simultaneous interpretation subtitles, such as an electronic device like a server, a mobile terminal, or a server cluster. As shown in fig. 7, the method specifically includes the following steps:
Step 710, acquiring, in real time, a source subtitle sent by the master control end and matched with the live audio/video stream.
In this embodiment of the disclosure, the translator end receives in real time the source subtitle sent by the master control end, where the source subtitle is a subtitle matched with the live audio/video stream collected by the master control end in real time.
Step 720, converting the source subtitle into a target translation subtitle in the language adapted by the local translator end.
Illustratively, the translator end translates the source subtitle into the target translation subtitle based on machine translation technology, where the target translation subtitle belongs to the language adapted by the local translator end.
Optionally, converting the source subtitle into a target translation subtitle in the language adapted by the local translator end includes: converting the source subtitle into an original translation subtitle by using a translation tool adapted by the local translator end; and displaying the original translation subtitle in local contrast with the source subtitle, and in response to received second correction information matched with the displayed original translation subtitle, correcting the original translation subtitle to generate the target translation subtitle. The advantage of this arrangement is that the accuracy of the target translation subtitle can be effectively ensured.
In this embodiment of the disclosure, when the speaker's pronunciation is non-standard or the voice data contains specialized terminology, the source subtitle obtained by performing speech recognition on the live audio/video stream may be inaccurate, which in turn makes the target translation subtitle generated after conversion inaccurate; an error in the machine translation itself may likewise make the generated target translation subtitle inaccurate. Therefore, the translated subtitle produced from the source subtitle by the translation tool adapted by the local translator end is used as the original translation subtitle, and the original translation subtitle is displayed in local contrast with the source subtitle, so the translator at the translator end can clearly judge whether the original translation subtitle is accurate. When the translator considers the original translation subtitle inaccurate, correction information can be input for it; the translator end, in response to the received second correction information matched with the displayed original translation subtitle, corrects it and uses the corrected original translation subtitle as the target translation subtitle.
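The translate-then-correct flow above can be sketched minimally as follows. The dictionary "translation tool" is a stand-in assumption; any machine translation backend could take its place, and the function names are hypothetical.

```python
# Hedged sketch of the translator-end flow: machine-translate the source
# subtitle, then let received correction information override the result.
def machine_translate(source, tool):
    return tool.get(source, source)  # fall back to the source if unknown

def finalize_translation(source, tool, correction=None):
    original = machine_translate(source, tool)   # original translation subtitle
    # Second correction information, if received, replaces the machine output.
    return correction if correction is not None else original

tool = {"bonjour": "hello"}  # toy stand-in for the adapted translation tool
assert finalize_translation("bonjour", tool) == "hello"
assert finalize_translation("bonjour", tool, correction="hi there") == "hi there"
```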
Step 730, sending the target translation subtitle to the master control end, so that the master control end adds the source subtitle and the target translation subtitles sent by each translator end to the live audio/video stream as simultaneous interpretation subtitles.
In this embodiment of the disclosure, the translator end sends the target translation subtitle to the master control end, and the master control end adds the source subtitle and the target translation subtitles to the live audio/video stream as simultaneous interpretation subtitles.
According to this embodiment of the disclosure, a source subtitle sent by the master control end and matched with the live audio/video stream is acquired in real time; the source subtitle is converted into a target translation subtitle in the language adapted by the local translator end; and the target translation subtitle is sent to the master control end, so that the master control end adds the source subtitle and the target translation subtitles sent by each translator end to the live audio/video stream as simultaneous interpretation subtitles. In this method, the translator ends and the master control end are set up independently: each translator end converts the source subtitle acquired from the master control end in real time into a target translation subtitle adapted to that translator end and sends it back. This solves the technical problem of the prior art, in which an interpreter modifies the simultaneously interpreted original text and translated text through an external webpage link, where the results are easily disordered and the complexity and risk of accurately determining the subtitles are increased. The master control end can thus acquire the simultaneous interpretation subtitles accurately, quickly, and in order, effectively guaranteeing the real-time performance of simultaneous interpretation.
Fig. 8 is a flowchart of a method for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. As shown in fig. 8, the method includes the following steps:
Step 810, acquiring, in real time, a plurality of sub-source captions corresponding to the live audio/video stream and sent one by one by the master control end.
In this embodiment of the disclosure, each sub-source caption is a sub-caption generated by the master control end segmenting the source subtitle matched with the live audio/video stream, and the translator end acquires in real time the sub-source captions sent by the master control end.
Step 820, converting each sub-source caption into a sub-original translation caption by using a translation tool adapted by the local translator end.
In this embodiment of the disclosure, each sub-source caption received in real time is translated by the translation tool adapted by the local translator end, generating the sub-original translation caption corresponding to that sub-source caption.
Step 830, displaying each sub-original translation caption in local contrast with the matched sub-source caption.
Optionally, displaying each sub-original translation caption in local contrast with the matched sub-source caption includes: displaying the current sub-original translation caption and the matched sub-source caption in contrast in a display page, and providing, in the display page, switching options for the previous caption and the next caption of the current sub-original translation caption; and in response to a received selection of a switching option, acquiring the switched-to sub-original translation caption and the matched sub-source caption for contrast display in the display page and moving an editing cursor to a set position in the switched-to caption, where a switching option switches the current sub-original translation caption and the corresponding sub-source caption in an associated manner. The advantage of this arrangement is that the sub-original translation caption currently displayed on the display page and the corresponding sub-source caption can be quickly switched together via the switching options, so the translator at the translator end can quickly correct the sub-original translation captions against the sub-source captions.
Illustratively, fig. 9 is a schematic interface diagram in which a sub-original translation caption and the matched sub-source caption are displayed in contrast on a display page in an embodiment of the present disclosure. As shown in fig. 9, the current sub-original translation caption and the matched sub-source caption are displayed in contrast in the display page, and the display page provides switching options for the previous caption and the next caption of the current sub-original translation caption, where a switching option switches the current sub-original translation caption and the matched sub-source caption in an associated manner. Specifically, by clicking the switching option for the previous caption, the translator at the translator end can quickly switch the currently displayed sub-original translation caption to the previous one, with the matched sub-source caption switched to the previous sub-source caption at the same time; by clicking the switching option for the next caption, the translator can quickly switch the currently displayed sub-original translation caption to the next one, with the matched sub-source caption switched to the next sub-source caption at the same time.
In response to a received selection of the switching option for the previous or next caption, the translator end acquires the switched-to sub-original translation caption and the sub-source caption matched with it (the switched-to caption being either the previous or the next sub-original translation caption relative to the currently displayed one) and displays them in contrast on the display interface, which makes it convenient for the translator to judge whether the switched-to sub-original translation caption is accurate. Specifically, while the switched-to sub-original translation caption and the matched sub-source caption are displayed in contrast in the display page, the editing cursor is moved to a set position in the switched-to caption, where the set position may be its start position, its end position, or any middle position. Optionally, the translator end disables the editing function for the sub-source captions, that is, the sub-source captions are not editable at the translator end. The advantage of this arrangement is that the translator can directly input correction information for the switched-to sub-original translation caption, realizing item-by-item correction of the sub-original translation captions and effectively improving the speed and accuracy of correction.
Optionally, different switching options correspond to different shortcut keys, which allows the translator at the translator end to quickly switch the currently displayed sub-original translation caption by operating a shortcut key, improving the speed of correcting the original translation captions. For example, the shortcut for switching to the previous caption may be Shift + Tab, and the shortcut for switching to the next caption may be Tab. Specifically, when the translator end detects that Shift + Tab is triggered, the sub-original translation caption currently displayed on the display page is quickly switched to the previous one, and the matched sub-source caption is likewise switched to the previous sub-source caption; that is, the previous sub-original translation caption and its matched sub-source caption are displayed in contrast in the display page. When the translator end detects that Tab is triggered, the currently displayed sub-original translation caption is quickly switched to the next one, and the matched sub-source caption is likewise switched to the next sub-source caption; that is, the next sub-original translation caption and its matched sub-source caption are displayed in contrast in the display page.
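The shortcut mapping just described can be sketched as below. Representing a key chord as a frozenset is an implementation assumption; the point is that one shared index moves both the translation pane and the source pane together.

```python
# Hedged sketch: Tab moves to the next sub-original/sub-source pair,
# Shift + Tab to the previous one. Both panes switch together because
# the pair shares a single index.
PREV_KEYS = frozenset({"Shift", "Tab"})
NEXT_KEYS = frozenset({"Tab"})

def handle_keys(pressed, index, total):
    if pressed == PREV_KEYS and index > 0:
        return index - 1            # previous caption pair
    if pressed == NEXT_KEYS and index < total - 1:
        return index + 1            # next caption pair
    return index                    # no switch at the boundaries

assert handle_keys(frozenset({"Tab"}), 1, 3) == 2
assert handle_keys(frozenset({"Shift", "Tab"}), 1, 3) == 0
assert handle_keys(frozenset({"Tab"}), 2, 3) == 2   # already at the last pair
```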
Step 840, in response to received second correction information matched with at least one displayed target sub-original translation caption, correcting the target sub-original translation caption to generate a target translation caption.
The target sub-original translation caption is a sub-original translation caption to be corrected, and there may be one or more of them. Specifically, second correction information matched with at least one displayed target sub-original translation caption is acquired, and the target sub-original translation caption is corrected.
Optionally, correcting the target sub-original translation caption in response to the received second correction information includes: in response to a received shortcut-key select-all operation by the user, selecting the entire displayed current sub-original translation caption as the target sub-original translation caption; and in response to the received second correction information matched with the target sub-original translation caption, correcting it as a whole. Illustratively, the shortcut key for the select-all operation may be the Backspace key: when this shortcut is detected, the sub-original translation caption currently displayed on the display page is selected in full, making it convenient for the user (the translator at the translator end) to directly input the correct translated caption. This suits application scenarios in which the determined sub-original translation caption differs greatly from the real speech content or the sub-source caption. Second correction information (that is, the correct caption re-entered by the user) corresponding to the fully selected target caption is then received, and the target caption is corrected as a whole. The advantage of this arrangement is that the translator can conveniently make a comprehensive, rapid modification to a sub-original translation caption.
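The full-select correction might be sketched with a tiny editor state machine. The class name and the two-event model (key press, then text input) are assumptions made for illustration.

```python
# Hedged sketch: the Backspace shortcut selects the whole displayed caption,
# and the next text the user enters replaces it in one overall correction.
class CaptionEditor:
    def __init__(self, caption):
        self.caption = caption
        self.fully_selected = False

    def on_key(self, key):
        if key == "Backspace":
            self.fully_selected = True   # select the entire current caption

    def on_text_input(self, text):
        if self.fully_selected:
            self.caption = text          # overall correction in one step
            self.fully_selected = False

e = CaptionEditor("wrong machine output")
e.on_key("Backspace")
e.on_text_input("Corrected translation")
assert e.caption == "Corrected translation"
```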
Step 850, sending the target translation captions to the master control end, so that the master control end adds the source subtitle and the target translation captions sent by each translator end to the live audio/video stream as simultaneous interpretation subtitles.
The method for determining simultaneous interpretation subtitles provided by this embodiment of the disclosure enables the translator at a translator end to correct the sub-original translation captions item by item, improving correction efficiency and helping the translator end accurately and quickly determine the sub-target translation captions matched with the sub-source captions.
Fig. 10 is a flowchart of a method for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. As shown in fig. 10, the method includes the following steps:
Step 1010, acquiring, in real time, a plurality of sub-source captions corresponding to the live audio/video stream and sent one by one by the master control end.
Step 1020, converting each sub-source caption into a sub-original translation caption by using a translation tool adapted by the local translator end.
Step 1030, determining the target number of each sub-original translation caption within the whole original translation subtitle.
Step 1040, determining the number of characters contained in each sub-original translation caption.
Illustratively, after each sub-source caption is converted into a sub-original translation caption, the number of characters contained in each sub-original translation caption is counted.
Step 1050, displaying each sub-original translation caption in local contrast with the matched sub-source caption, and locally displaying the target number of each sub-original translation caption and the number of characters it contains, where each sub-original translation caption and the matched sub-source caption share the same target number.
Optionally, displaying each sub-original translation caption in local contrast with the matched sub-source caption includes: displaying the current sub-original translation caption and the matched sub-source caption in contrast in a display page, and providing, in the display page, switching options for the previous caption and the next caption of the current sub-original translation caption; and in response to a received selection of a switching option, acquiring the switched-to sub-original translation caption and the matched sub-source caption for contrast display in the display page and moving an editing cursor to a set position in the switched-to caption, where a switching option switches the current sub-original translation caption and the corresponding sub-source caption in an associated manner.
Optionally, the target number of each sub-source caption within the whole source subtitle, as sent by the master control end, may be acquired and used as the target number of the sub-original translation caption matched with that sub-source caption. Alternatively, the translator end receives the sub-source captions sent one by one by the master control end and numbers each sub-original translation caption as each sub-source caption is converted. It should be noted that each sub-original translation caption and the matched sub-source caption share the same target number. Locally displaying the target numbers of the sub-original translation captions allows the translator at the translator end to clearly know which sub-original translation caption is currently displayed and, during subsequent correction, exactly which caption is being corrected.
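The shared-number bookkeeping can be sketched as follows. The function name and the tuple representation are assumptions; the essential point from the text is that a sub-original translation caption inherits the target number of the sub-source caption it came from, so the pair can always be matched for contrast display and correction.

```python
# Hedged sketch: carry each sub-source caption's target number onto its
# sub-original translation caption so the two always share one number.
def number_translations(numbered_sources, translate):
    """numbered_sources: iterable of (target_number, sub_source_caption)."""
    return [(n, src, translate(src)) for n, src in numbered_sources]

pairs = number_translations(
    [(1, "hello"), (2, "goodbye")],
    translate=str.upper,  # toy stand-in for the adapted translation tool
)
assert pairs[1] == (2, "goodbye", "GOODBYE")  # same target number as its source
```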
Step 1060, when the number of characters of a sub-original translation caption exceeds a preset second number threshold, issuing a user reminder for that abnormal sub-original translation caption.
In this embodiment of the disclosure, while each sub-original translation caption is displayed in local contrast with the matched sub-source caption, its number of characters is displayed locally, so the translator at the translator end can clearly see the length of each sub-original translation caption. When the number of characters exceeds the preset second number threshold, the caption is too long, which makes it inconvenient for the translator to correct it quickly and, when the live picture is played, makes the caption less attractive on screen and harder for users to read. Therefore, a sub-original translation caption whose character count exceeds the preset second number threshold may be determined as an abnormal sub-original translation caption, and a reminder may be issued for it, for example: "This sub-original translation caption is long; please speed up its correction."
And step 1070, in response to the received second correction information matched with the displayed at least one target sub original translation subtitle, correcting the target sub original translation subtitle to generate a target translation subtitle.
Optionally, in response to the received second correction information matched with the displayed at least one target sub-original translated caption, correcting the target sub-original translated caption, including: responding to the received shortcut key full selection operation of the user, and performing full selection on the displayed current sub original translation subtitle to be used as a target sub original translation subtitle; and responding to the received second correction information matched with the target sub original translation caption, and performing overall correction on the target sub original translation caption.
Step 1080, sending the target translation subtitles to the main control end, so that the main control end adds the source subtitles and the target translation subtitles sent by each translator end to the live audio and video stream as simultaneous interpretation subtitles.
Step 1090, acquiring the playing number, sent by the main control end, of the sub-source subtitle currently played at the main control end among all the played sub-source subtitles; the currently played sub-target translation subtitle and the currently played sub-source subtitle share the same playing number.
In the embodiment of the present disclosure, after the main control end adds the simultaneous interpretation subtitles to the live audio/video stream, it plays the target audio/video stream to which the simultaneous interpretation subtitles have been added. When the target audio/video stream is played, the simultaneous interpretation subtitles in it are played item by item, each sub-target translation subtitle being displayed in contrast with its matched sub-source subtitle. Therefore, in the process of playing the target audio/video stream, the main control end can obtain the currently played sub-source subtitle in real time and determine its playing number among all the played sub-source subtitles. The translator end acquires the playing number sent by the main control end; this playing number is shared by the currently played sub-source caption, the currently played sub-target translation caption, and the corresponding sub-original translated caption. The playing number and the target number of each sub-original translated caption are displayed locally at the translator end, so that the translator at the translator end can clearly know which sub-caption is currently being played, and can adjust the modification progress of the sub-original translated captions according to the playing number and the target number of each sub-original translated caption.
For example, if the playing number of the currently played sub-source caption is 6 and the translator at the translator end is modifying the sub-original translated caption with target number 4, then the sub-source caption numbered 4 and its matched sub-target translation caption have obviously already been played. At this time, the modification of the sub-original translated caption with target number 4 is an invalid modification, and the translator can adjust the modification progress accordingly, for example, skip the modification of the caption numbered 4 and modify the sub-original translated caption numbered 8 instead. In this way, the translator at the translator end can adjust the modification progress of the sub-original translated captions according to the playing number and the target number of each sub-original translated caption.
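The progress-adjustment rule in the example above can be sketched as follows. The helper names and the "jump to the first unplayed caption" policy are assumptions for illustration; the disclosure only requires that corrections target captions not yet played.

```python
def modification_is_valid(playing_number, target_number):
    """A correction is useful only if the caption has not been played yet."""
    return target_number > playing_number

def adjust_progress(playing_number, target_number):
    """Keep the current target if still useful; otherwise skip played captions."""
    if modification_is_valid(playing_number, target_number):
        return target_number
    return playing_number + 1  # jump to the first caption still to be played
```

With playing number 6, correcting caption 4 is rejected while correcting caption 8 is allowed, matching the example in the text.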
Illustratively, fig. 11 is a schematic diagram of a display page showing, in an embodiment of the present disclosure, the sub-original translated subtitles and the matched sub-source subtitles, the number of characters of each sub-original translated subtitle, the target number of each sub-original translated subtitle, and the playing number of the currently played sub-source subtitle among all the played sub-source subtitles.
The method for determining simultaneous interpretation subtitles provided by the embodiment of the disclosure enables the translator at the translator end to correct the sub-original translated subtitles item by item, improves the efficiency of correcting the sub-original translated subtitles, and helps the translator end accurately and quickly determine the sub-target translation subtitles matched with the sub-source subtitles. In addition, the number of characters of each sub-original translated caption, the target number of each sub-original translated caption, and the playing number of the currently played sub-source caption are displayed on the display page of the translator end, so that the translator at the translator end can clearly know whether the length of each sub-original translated caption is proper, and can timely adjust the modification progress of the sub-original translated captions according to the playing progress of the sub-source captions.
Fig. 12 is a signaling diagram of a method for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. The method for determining simultaneous interpretation subtitles provided by the embodiment of the present disclosure can be understood with reference to fig. 12.
Fig. 13 is a schematic structural diagram of an apparatus for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. The apparatus is applied to a main control end, and as shown in fig. 13, includes: a first source subtitle obtaining module 1310, a source subtitle sending module 1320, and a simultaneous interpretation subtitle adding module 1330.
A first source subtitle obtaining module 1310, configured to obtain a source subtitle matching a live audio/video stream in real time;
a source subtitle sending module 1320, configured to send the source subtitle to at least one translator end, where different translator ends are configured to translate the source subtitle into target translation subtitles in different languages;
and a simultaneous interpretation subtitle adding module 1330, configured to receive the target translation subtitles fed back by the translator ends for the source subtitles, and add the source subtitles and the target translation subtitles to the live audio/video stream as simultaneous interpretation subtitles.
According to the embodiment of the disclosure, a source subtitle matched with a live audio and video stream is acquired in real time; the source subtitle is sent to at least one translator end, and different translator ends are used for translating the source subtitle into target translation subtitles of different languages; and the target translation subtitles fed back by the translator ends for the source subtitle are received, and the source subtitle and the target translation subtitles are added to the live audio and video stream as simultaneous interpretation subtitles. The apparatus for determining simultaneous interpretation subtitles provided by the embodiment of the disclosure acquires source subtitle data matched with the live audio and video stream in real time at the main control end, and acquires target translation subtitles in different languages from at least one translator end arranged independently of the main control end, so that simultaneous interpretation subtitles can be acquired accurately, quickly and orderly, multi-language translation subtitles can be acquired simultaneously, and the real-time performance of simultaneous interpretation is effectively guaranteed.
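The flow through modules 1310-1330 can be sketched as follows. This is a minimal illustration: the translator ends are modeled as plain callables and the live stream as a list, which are assumptions, not part of the disclosure.

```python
class MasterControl:
    """Modules 1310-1330 collapsed into one object for illustration."""

    def __init__(self, translator_ends):
        self.translator_ends = translator_ends  # one callable per target language

    def process(self, source_subtitle, live_stream):
        # Module 1320: send the source subtitle to every translator end ...
        translations = [translate(source_subtitle) for translate in self.translator_ends]
        # Module 1330: ... then add source and target translations to the stream.
        live_stream.append({"source": source_subtitle, "translations": translations})
        return live_stream
```

Because every translator end receives the same source subtitle, multi-language translation subtitles are obtained in a single pass, as the paragraph above states.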
Optionally, the apparatus further comprises:
a source subtitle correction module, configured to locally display the source subtitle before the source subtitle is sent to the at least one translator end, and to correct the source subtitle in response to received first correction information matched with the displayed source subtitle.
Optionally, the apparatus further comprises:
a source subtitle segmentation module, configured to segment the source subtitle matched with the live audio and video stream in real time according to a preset mode to generate at least one sub-source subtitle;
The source subtitle correction module comprises:
a source subtitle correction unit, configured to locally display the sub-source subtitles item by item, and to correct the target sub-source subtitle in response to received first correction information matched with at least one displayed target sub-source subtitle.
Optionally, the source subtitle sending module is configured to:
and sending each sub-source caption to each translator end one by one so as to generate sub-target translation captions matched with the sub-source captions one by one through each translator end.
Optionally, the source subtitle correction unit is configured to:
displaying a current sub-source caption in a display page, and providing a switching option for a previous caption and a next caption of the current sub-source caption in the display page;
and in response to a received selection of a switching option, acquiring the switched-to sub-source caption for display in the display page, and moving an editing cursor to a set position in that sub-source caption.
Optionally, different switching options correspond to different shortcut keys.
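The previous/next switching behavior can be sketched as follows. The class name and the choice of "end of caption" as the set cursor position are assumptions for illustration; the disclosure only specifies that the cursor moves to a set position.

```python
class CaptionPager:
    """Item-by-item display with previous/next switching options."""

    def __init__(self, captions):
        self.captions = captions
        self.index = 0  # current sub-source caption

    def switch(self, option):
        """option: 'prev' or 'next' (each bound to its own shortcut key)."""
        if option == "next" and self.index < len(self.captions) - 1:
            self.index += 1
        elif option == "prev" and self.index > 0:
            self.index -= 1
        caption = self.captions[self.index]
        return caption, len(caption)  # cursor moved to the set position (here: end)
```

Binding 'prev' and 'next' to distinct shortcut keys lets the translator page through captions without leaving the keyboard, which is the point of the option.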
Optionally, the source subtitle correction unit is further configured to:
in response to a received shortcut-key full-selection operation of the user, fully select the displayed current sub-source subtitle as the target sub-source subtitle;
and, in response to the received first correction information matched with the target sub-source caption, perform overall correction on the target sub-source caption.
Optionally, the apparatus further comprises:
the first character number determining module is used for determining the number of characters contained in each sub-source caption after at least one sub-source caption is generated;
and, when the sub-source subtitles are locally displayed item by item, the apparatus is further used for:
locally displaying the number of characters of each sub-source caption, and reminding the user of an abnormal sub-source caption when its number of characters exceeds a preset first number threshold.
Optionally, the apparatus further comprises:
the audio and video playing module is used for playing the target audio and video stream after the source caption and each target translation caption are added into the live audio and video stream as simultaneous interpretation captions; and the target audio and video stream is a live audio and video stream with the simultaneous interpretation subtitles.
Optionally, in the process of playing the target audio/video stream, the apparatus is further used for:
acquiring a currently played sub-source subtitle from a target audio and video stream in real time, and acquiring a playing number of the currently played sub-source subtitle in all played sub-source subtitles;
locally displaying the playing number, and sending the playing number to the at least one translator end;
after generating at least one sub-source caption, further comprising:
determining the target number of each sub-source caption in the whole source caption;
when the sub-source subtitles are locally displayed item by item, the method further comprises the following steps:
and locally displaying the target number of each sub-source caption, and sending the target number of each sub-source caption to the at least one translator end.
Optionally, the first source subtitle obtaining module is configured to:
acquiring live audio and video streams in real time, and caching the live audio and video streams based on preset delay time;
acquiring a source subtitle matched with the live audio and video stream within the preset delay time;
the audio and video playing module is used for:
and when the preset delay time is reached, playing the target audio and video stream.
Fig. 14 is a schematic structural diagram of an apparatus for determining simultaneous interpretation subtitles according to another embodiment of the present disclosure. The apparatus is applied to a translator end, and as shown in fig. 14, includes: a second source subtitle obtaining module 1410, a translated subtitle determining module 1420, and a translated subtitle sending module 1430.
The second source subtitle obtaining module 1410 is used for obtaining a source subtitle which is sent by the main control end and matched with the live audio and video stream in real time;
a translated subtitle determining module 1420, configured to convert the source subtitle into a target translation subtitle in a language adapted to the local translator end;
and the translation subtitle sending module 1430 is configured to send the target translation subtitle to the master control end, so that the master control end adds the source subtitle and the target translation subtitle sent by each translator end as simultaneous interpretation subtitles to the live audio/video stream.
According to the embodiment of the disclosure, a source subtitle which is sent by the main control end and matched with the live audio and video stream is acquired in real time; the source subtitle is converted into a target translation subtitle in a language adapted to the local translator end; and the target translation subtitle is sent to the main control end, so that the main control end adds the source subtitles and the target translation subtitles sent by each translator end to the live audio and video stream as simultaneous interpretation subtitles. In the apparatus for determining simultaneous interpretation subtitles provided by the embodiment of the disclosure, the translator end and the main control end are arranged independently: the translator end converts the source subtitles acquired from the main control end in real time into target translation subtitles adapted to the local translator end and sends them to the main control end. This solves the technical problem in the prior art that a translator corrects the simultaneously interpreted original text and translated text through an external webpage link, which easily causes disordered results and increases the complexity and risk of accurately determining the subtitles; the main control end can thus acquire the simultaneous interpretation subtitles accurately, quickly and orderly, and the real-time performance of simultaneous interpretation is effectively guaranteed.
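The translator-end pipeline described above can be sketched as follows. `machine_translate` stands in for the local translation tool and `correct` for the second correction information; both names are assumptions for illustration.

```python
def translator_end(source_subtitle, machine_translate, correct=None):
    """Turn a source subtitle into a target translation subtitle."""
    draft = machine_translate(source_subtitle)   # original translation subtitle
    if correct is not None:                      # apply second correction information
        draft = correct(draft)
    return draft                                 # sent back to the main control end
```

When no correction is received, the original translation subtitle is forwarded as-is; otherwise the corrected version becomes the target translation subtitle.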
Optionally, the translation subtitle determining module includes:
an original translation subtitle obtaining unit, configured to convert the source subtitle into an original translation subtitle by using a translation tool adapted to the local translator end;
and the original translation caption correcting unit is used for performing local contrast display on the original translation caption and the source caption, responding to the received second correction information matched with the displayed original translation caption, correcting the original translation caption and generating the target translation caption.
Optionally, the second source subtitle obtaining module is configured to:
acquiring a plurality of sub-source subtitles which are sent by the main control end one by one and correspond to the live audio and video stream in real time;
the original translation caption obtaining unit is configured to:
converting each sub-source caption into a sub-original translation caption by using a translation tool adapted to a local translator end;
the original translation subtitle modification unit comprises:
the contrast display subunit is used for performing local contrast display on each sub-original translation subtitle and the matched sub-source subtitle;
and the sub-original translation caption correcting subunit is used for responding to the received second correction information matched with the displayed at least one target sub-original translation caption, and correcting the target sub-original translation caption to generate the target translation caption.
Optionally, the comparison display subunit is configured to:
displaying the current sub-original translation subtitle and the matched sub-source subtitle in contrast in a display page, and providing switching options for the previous subtitle and the next subtitle of the current sub-original translation subtitle in the display page;
and, in response to a received selection of a switching option, acquiring the switched-to sub-original translation subtitle and its matched sub-source subtitle for contrast display in the display page, and moving an editing cursor to a set position in that sub-original translation subtitle; the switching option switches the current sub-original translation subtitle and the corresponding sub-source subtitle in an associated manner.
Optionally, different switching options correspond to different shortcut keys.
Optionally, the sub-original translation subtitle modification subunit is configured to:
in response to a received shortcut-key full-selection operation of the user, fully select the displayed current sub-original translation subtitle as the target sub-original translation subtitle;
and, in response to the received second correction information matched with the target sub-original translation subtitle, perform overall correction on the target sub-original translation subtitle.
Optionally, the apparatus further comprises:
a playing number acquisition module, configured to acquire the playing number, sent by the main control end, of the sub-source subtitle currently played at the main control end among all the played sub-source subtitles;
the device further comprises:
the target number determining module is used for determining the target number of each sub-original translation caption in the whole original translation caption after each sub-source caption is converted into the sub-original translation caption;
and, when each sub-original translation subtitle and the matched sub-source subtitle are locally displayed in contrast, the apparatus is further used for:
displaying the target number of each sub-original translation caption; and each sub-original translation caption and the matched sub-source caption share the same target number.
Optionally, the apparatus further comprises:
the second character number determining module is used for determining the number of characters contained in each sub-original translation caption after each sub-source caption is converted into the sub-original translation caption;
and, when each sub-original translation subtitle and the matched sub-source subtitle are locally displayed in contrast, the apparatus is further used for:
and locally displaying the number of characters contained in each sub-original translation caption, and reminding a user of the abnormal sub-original translation caption when the number of the characters of the abnormal sub-original translation caption exceeds a preset second number threshold.
The device can execute the methods provided by all the embodiments of the disclosure, and has corresponding functional modules and beneficial effects for executing the methods. For technical details which are not described in detail in the embodiments of the present disclosure, reference may be made to the methods provided in all the aforementioned embodiments of the present disclosure.
Referring now to FIG. 15, a block diagram of an electronic device 300 suitable for use in implementing embodiments of the present disclosure is shown. The electronic device in the embodiments of the present disclosure may include, but is not limited to, a mobile terminal such as a mobile phone, a notebook computer, a digital broadcast receiver, a PDA (personal digital assistant), a PAD (tablet computer), a PMP (portable multimedia player), a vehicle terminal (e.g., a car navigation terminal), and the like, and a fixed terminal such as a digital TV, a desktop computer, and the like, or various forms of servers such as a stand-alone server or a server cluster. The electronic device shown in fig. 15 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 15, the electronic device 300 may include a processing device (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 302 or a program loaded from a storage device 308 into a random access memory (RAM) 303. In the RAM 303, various programs and data necessary for the operation of the electronic device 300 are also stored. The processing device 301, the ROM 302, and the RAM 303 are connected to each other via a bus 304. An input/output (I/O) interface 305 is also connected to the bus 304.
Generally, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touch pad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage devices 308 including, for example, magnetic tape, hard disk, etc.; and a communication device 309. The communication means 309 may allow the electronic device 300 to communicate wirelessly or by wire with other devices to exchange data. While fig. 15 illustrates an electronic device 300 having various means, it is to be understood that not all illustrated means are required to be implemented or provided. More or fewer devices may alternatively be implemented or provided.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer-readable medium, the computer program containing program code for performing the method for determining simultaneous interpretation subtitles. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 309, or installed from the storage device 308, or installed from the ROM 302. The computer program, when executed by the processing device 301, performs the above-described functions defined in the methods of the embodiments of the present disclosure.
It should be noted that the computer readable medium in the present disclosure can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
In some embodiments, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may interconnect with digital data communication in any form or medium (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), an internetwork (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed network.
The computer readable medium may be embodied in the electronic device; or may exist separately without being assembled into the electronic device.
The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to perform the method for determining simultaneous interpretation subtitles as provided in the first or second aspect.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. In some cases, the name of a unit does not constitute a limitation of the unit itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), Complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, a method for determining simultaneous interpretation subtitles is provided, applied to a main control end and including:
acquiring a source subtitle matched with a live audio and video stream in real time;
the source subtitle is sent to at least one translator end, and different translator ends are used for translating the source subtitle into target translation subtitles of different languages;
and receiving target translation subtitles fed back by the translator end aiming at the source subtitles, and adding the source subtitles and the target translation subtitles into the live audio and video stream as simultaneous interpretation subtitles.
Further, before the source subtitle is sent to the at least one translator end, the method further includes:
and locally displaying the source caption, and responding to the received first correction information matched with the displayed source caption to correct the source caption.
Further, after the source subtitle matched with the live audio/video stream is obtained in real time, the method further comprises the following steps:
segmenting the source subtitles according to a preset mode to generate at least one sub-source subtitle;
locally displaying the source caption, and responding to received first correction information matched with the displayed source caption to correct the source caption, wherein the method comprises the following steps:
and locally displaying the sub-source subtitles item by item, and responding to the received first correction information matched with at least one displayed target sub-source subtitle to correct the target sub-source subtitle.
Further, the sending the source subtitle to at least one interpreter side includes:
and sending each sub-source caption to each translator end one by one so as to generate sub-target translation captions matched with the sub-source captions one by one through each translator end.
Further, locally displaying the sub-source subtitles item by item, including:
displaying a current sub-source caption in a display page, and providing a switching option for a previous caption and a next caption of the current sub-source caption in the display page;
and in response to a received selection of a switching option, acquiring the switched-to sub-source caption for display in the display page, and moving an editing cursor to a set position in that sub-source caption.
Further, different switching options correspond to different shortcut keys.
Further, in response to receiving first correction information matched with the displayed at least one target sub-source subtitle, correcting the target sub-source subtitle, including:
in response to a received shortcut-key full-selection operation of the user, fully selecting the displayed current sub-source subtitle as the target sub-source subtitle;
and, in response to the received first correction information matched with the target sub-source caption, performing overall correction on the target sub-source caption.
Further, after generating the at least one sub-source subtitle, the method further includes:
determining the number of characters contained in each sub-source subtitle;
and, when the sub-source subtitles are locally displayed item by item, the method further includes:
locally displaying the character count of each sub-source subtitle, and reminding the user of any abnormal sub-source subtitle whose character count exceeds a preset first number threshold.
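The character-count check above amounts to a simple threshold scan; a sketch under the assumption that the threshold is a plain character-count limit (the function name and threshold value are illustrative, not from the disclosure):

```python
FIRST_THRESHOLD = 42  # assumed value of the preset first number threshold

def find_abnormal_subtitles(
    sub_subtitles: list[str], threshold: int = FIRST_THRESHOLD
) -> list[tuple[int, int]]:
    """Return (index, char_count) for every abnormal sub-source subtitle."""
    abnormal = []
    for i, sub in enumerate(sub_subtitles):
        count = len(sub)
        if count > threshold:
            # In the real system this would trigger the on-screen reminder.
            abnormal.append((i, count))
    return abnormal
```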
Further, after the source subtitle and each target translation subtitle are added to the live audio/video stream as simultaneous interpretation subtitles, the method further includes:
playing the target audio/video stream, where the target audio/video stream is the live audio/video stream carrying the simultaneous interpretation subtitles.
Further, during playback of the target audio/video stream, the method further includes:
acquiring, in real time, the currently played sub-source subtitle from the target audio/video stream, and acquiring the playing number of the currently played sub-source subtitle among all played sub-source subtitles;
and locally displaying the playing number, and sending the playing number to the at least one interpreter side.
After generating the at least one sub-source subtitle, the method further includes:
determining the target number of each sub-source subtitle within the whole source subtitle;
and, when the sub-source subtitles are locally displayed item by item, the method further includes:
locally displaying the target number of each sub-source subtitle, and sending the target numbers of the sub-source subtitles to the at least one interpreter side.
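The two numbers in this section can be kept apart with a small sketch: the target number is each sub-source subtitle's fixed ordinal within the whole source subtitle, while the playing number is the ordinal of whichever sub-source subtitle is currently being played. Both are shared with the interpreter sides so everyone tracks the same position. Function names and the 1-based numbering are assumptions of this sketch.

```python
def assign_target_numbers(sub_subtitles: list[str]) -> dict[int, str]:
    """Map each sub-source subtitle to its fixed 1-based target number."""
    return {i + 1: sub for i, sub in enumerate(sub_subtitles)}

def playing_number(played_subs: list[str], current_sub: str) -> int:
    """Ordinal of the currently played sub-source subtitle among all played."""
    return played_subs.index(current_sub) + 1
```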
Further, acquiring the source subtitle matched with the live audio/video stream in real time includes:
acquiring the live audio/video stream in real time, and buffering the live audio/video stream based on a preset delay time;
and acquiring the source subtitle matched with the live audio/video stream within the preset delay time.
Playing the target audio/video stream includes:
playing the target audio/video stream when the preset delay time is reached.
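The delay buffer described above can be sketched as a queue that holds each live segment for the preset delay, giving the subtitle pipeline a window to attach subtitles before release. The class, the delay value, and the timestamp-based release rule are assumptions of this illustration.

```python
import collections

PRESET_DELAY = 5.0  # seconds; assumed value of the preset delay time

class DelayBuffer:
    """Hold live stream segments until the preset delay time has elapsed."""

    def __init__(self, delay: float = PRESET_DELAY):
        self.delay = delay
        self.queue = collections.deque()  # entries of (arrival_time, segment)

    def push(self, segment, now: float) -> None:
        self.queue.append((now, segment))

    def pop_ready(self, now: float) -> list:
        """Release every segment whose preset delay time has been reached."""
        ready = []
        while self.queue and now - self.queue[0][0] >= self.delay:
            ready.append(self.queue.popleft()[1])
        return ready
```

During the held window, the source subtitle is produced and the simultaneous interpretation subtitles are attached, so playback of the target stream lags the raw live stream by exactly the preset delay.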
An embodiment of the present disclosure further provides a method for determining simultaneous interpretation subtitles, applied to an interpreter side, including:
acquiring, in real time, a source subtitle sent by a main control side and matched with a live audio/video stream;
converting the source subtitle into a target translation subtitle in a language adapted by the local interpreter side;
and sending the target translation subtitle to the main control side, so that the main control side adds the source subtitle and the target translation subtitles sent by the interpreter sides to the live audio/video stream as simultaneous interpretation subtitles.
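The interpreter-side flow above is a receive → machine-translate → human-correct → return pipeline; a hedged sketch, where `machine_translate` stands in for whatever translation tool the interpreter side has adapted and `apply_corrections` stands in for the interpreter's manual edits (both callables are placeholders, not named in the disclosure):

```python
from typing import Callable

def interpreter_side_flow(
    sub_source: str,
    machine_translate: Callable[[str], str],
    apply_corrections: Callable[[str], str],
) -> str:
    """Turn one received sub-source subtitle into a target translation subtitle."""
    draft = machine_translate(sub_source)   # the original translation subtitle
    target = apply_corrections(draft)       # second correction information applied
    return target                           # sent back to the main control side
```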
Further, converting the source subtitle into a target translation subtitle in a language adapted by the local interpreter side includes:
converting the source subtitle into an original translation subtitle using a translation tool adapted by the local interpreter side;
and locally displaying the original translation subtitle in contrast with the source subtitle, and, in response to received second correction information matched with the displayed original translation subtitle, correcting the original translation subtitle to generate the target translation subtitle.
Further, acquiring in real time the source subtitle sent by the main control side and matched with the live audio/video stream includes:
acquiring, in real time, a plurality of sub-source subtitles corresponding to the live audio/video stream and sent one by one by the main control side.
Converting the source subtitle into the original translation subtitle using a translation tool adapted by the local interpreter side includes:
converting each sub-source subtitle into a sub-original translation subtitle using the translation tool adapted by the local interpreter side.
Locally displaying the original translation subtitle in contrast with the source subtitle, correcting the original translation subtitle in response to received second correction information matched with it, and generating the target translation subtitle includes:
locally displaying each sub-original translation subtitle in contrast with the matched sub-source subtitle;
and, in response to received second correction information matched with at least one displayed target sub-original translation subtitle, correcting that target sub-original translation subtitle to generate the target translation subtitle.
Further, locally displaying each sub-original translation subtitle in contrast with the matched sub-source subtitle includes:
displaying the current sub-original translation subtitle in contrast with the matched sub-source subtitle in a display page, and providing, in the display page, switching options for the previous and next subtitles of the current sub-original translation subtitle;
and, in response to a received selection of a switching option, acquiring the switched sub-original translation subtitle and the matched sub-source subtitle for contrast display in the display page, and moving the editing cursor to a set position in the switched sub-original translation subtitle, where the switching option switches the current sub-original translation subtitle and the corresponding sub-source subtitle in an associated manner.
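The associated (linked) switching can be sketched by keeping one shared index over both columns, so selecting "previous" or "next" always moves the sub-original translation subtitle and its matched sub-source subtitle together. The class name is illustrative, and "set position" is again assumed to mean end-of-text.

```python
class ContrastNavigator:
    """Linked navigator over matched (sub-source, sub-original-translation) pairs."""

    def __init__(self, sub_sources: list[str], sub_drafts: list[str]):
        assert len(sub_sources) == len(sub_drafts)
        self.sub_sources = sub_sources
        self.sub_drafts = sub_drafts  # the sub-original translation subtitles
        self.index = 0                # one shared index keeps the pair in sync
        self.cursor = 0

    def switch(self, option: str) -> tuple[str, str]:
        """Switch both sides of the contrast display in association."""
        if option == "next" and self.index < len(self.sub_drafts) - 1:
            self.index += 1
        elif option == "previous" and self.index > 0:
            self.index -= 1
        # Cursor moves to a set position in the switched draft subtitle.
        self.cursor = len(self.sub_drafts[self.index])
        return self.sub_sources[self.index], self.sub_drafts[self.index]
```

Because both columns are driven by the same index, the interpreter can never be shown a draft translation next to the wrong source line.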
Further, different switching options correspond to different shortcut keys.
Further, correcting the at least one displayed target sub-original translation subtitle in response to received second correction information matched with it includes:
in response to a received select-all shortcut operation from the user, selecting the entire displayed current sub-original translation subtitle as the target sub-original translation subtitle;
and, in response to received second correction information matched with the target sub-original translation subtitle, correcting the target sub-original translation subtitle as a whole.
Further, the method further includes:
acquiring the playing number, sent by the main control side, of the sub-source subtitle currently played at the main control side among all played sub-source subtitles.
After converting each sub-source subtitle into a sub-original translation subtitle, the method further includes:
determining the target number of each sub-original translation subtitle within the whole original translation subtitle;
and, when each sub-original translation subtitle is locally displayed in contrast with the matched sub-source subtitle, the method further includes:
displaying the target number of each sub-original translation subtitle, where each sub-original translation subtitle and the matched sub-source subtitle share the same target number.
Further, after converting each sub-source subtitle into a sub-original translation subtitle, the method further includes:
determining the number of characters contained in each sub-original translation subtitle;
and, when each sub-original translation subtitle is locally displayed in contrast with the matched sub-source subtitle, the method further includes:
locally displaying the character count of each sub-original translation subtitle, and reminding the user of any abnormal sub-original translation subtitle whose character count exceeds a preset second number threshold.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present disclosure and the technical principles employed. Those skilled in the art will appreciate that the present disclosure is not limited to the particular embodiments described herein, and that various obvious changes, adaptations, and substitutions are possible, without departing from the scope of the present disclosure. Therefore, although the present disclosure has been described in greater detail with reference to the above embodiments, the present disclosure is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present disclosure, the scope of which is determined by the scope of the appended claims.