CN104869326B - Image display method and device coordinated with audio
- Publication number
- CN104869326B (application CN201510279742.7)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- User Interface Of Digital Computer (AREA)
Abstract
Embodiments of the present invention provide an image display method coordinated with audio. The method includes: running a dialogue scene; dynamically displaying the mouth shape of a scene character when the dialogue scene runs in a speech time period; and statically displaying the mouth shape of the scene character when the dialogue scene runs in a silent time period. The speech time period and the silent time period are obtained by dividing the dialogue scene according to waveform information of the audio of the dialogue scene: the waveform amplitude of the audio is greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold is not less than the second amplitude threshold. By dividing the dialogue scene into speech time periods and silent time periods according to the audio waveform information, the method enables the dialogue audio of the scene character to match the mouth shape images, providing the user with a more lifelike dialogue display effect. Embodiments of the present invention also provide an image display device coordinated with audio.
Description
Technical field
Embodiments of the present invention relate to the field of image data processing, and more specifically to an image display method and device coordinated with audio.
Background
This section is intended to provide background or context for the embodiments of the invention set forth in the claims. The description herein is not admitted to be prior art merely because it is included in this section.
Game applications, animated videos, and computer simulation applications often involve dialogue scenes in which the image display must be coordinated with audio. In these dialogue scenes, scene characters take turns speaking. For example, a game application usually includes story dialogue scenes in which game characters converse in turn. A dialogue scene therefore needs not only to play the sound of the scene characters' dialogue but also to present character mouth shapes that match the dialogue audio; that is, the mouth shape of a scene character needs to change dynamically while the character is speaking.
To make the mouth shape change dynamically while a scene character speaks, the prior art presets pictures of different mouth shapes of the scene character for the dialogue scene. When the application runs the dialogue scene, it dynamically switches among these pictures, so that the mouth shape of the scene character in the displayed image changes dynamically and matches the character's dialogue in the audio of the dialogue scene.
Disclosure of the invention
It should be noted that a scene character in a dialogue scene usually does not speak continuously. In many cases the character's speech contains pauses: even within a dialogue scene, the character is speaking in some stages and pausing in other stages or time gaps. When the scene character is silent in such a stage or time gap, the displayed mouth shape needs to remain unchanged in order to produce a more lifelike dialogue display effect. In the prior art, however, the application switches among the pictures of the character's different mouth shapes for the dialogue scene as a whole. As a result, the mouth shape of the scene character in the displayed image changes continuously; even in the time gaps in which the character is silent, the mouth shape keeps changing, so the dialogue audio of the scene character does not match the mouth shape images.
Therefore, in the prior art, during those stages of a dialogue scene in which the scene character is silent, the mouth shape of the character in the displayed image still changes dynamically, so the dialogue audio of the scene character cannot be matched with the mouth shape images in those stages, which is a very annoying process.
There is thus a strong need for an improved image display method and device coordinated with audio, so that an image in which the character's mouth shape changes dynamically is displayed in the stages of the dialogue scene in which the scene character is speaking, and an image in which the mouth shape remains unchanged is displayed in the stages in which the character is silent, so that the dialogue audio of the scene character matches the mouth shape images in every stage of the dialogue scene.
In this context, embodiments of the present invention are intended to provide an image display method and device coordinated with audio.
In a first aspect of embodiments of the present invention, an image display method coordinated with audio is provided, including: running a dialogue scene; when the dialogue scene runs in a speech time period, dynamically displaying the mouth shape of a scene character; and when the dialogue scene runs in a silent time period, statically displaying the mouth shape of the scene character. The speech time period and the silent time period are obtained by dividing the dialogue scene according to waveform information of the audio corresponding to the dialogue scene, where the waveform amplitude of the waveform information is greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold is not less than the second amplitude threshold.
In a second aspect of embodiments of the present invention, an image display device coordinated with audio is provided, including: a running module, configured to run a dialogue scene; a dynamic display module, configured to dynamically display the mouth shape of a scene character when the dialogue scene runs in a speech time period; and a static display module, configured to statically display the mouth shape of the scene character when the dialogue scene runs in a silent time period. The speech time period and the silent time period are obtained by dividing the dialogue scene according to waveform information of the audio corresponding to the dialogue scene, where the waveform amplitude of the waveform information is greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold is not less than the second amplitude threshold.
According to the image display method and device coordinated with audio of embodiments of the present invention, the dialogue scene is divided according to its audio waveform information: periods in which the waveform amplitude is relatively large are taken as speech time periods, and periods in which the waveform amplitude is relatively small are taken as silent time periods. When the dialogue scene runs, the mouth shape of the scene character can be displayed dynamically in the speech time periods and statically in the silent time periods. Because a large waveform amplitude indicates that the scene character is speaking and a small waveform amplitude indicates that the character is not speaking, displaying the mouth shape dynamically in the speech time periods and statically in the silent time periods means that the image of a changing mouth shape is shown only while the character is speaking, and the image of an unchanged mouth shape is shown while the character is silent. The dialogue audio of the scene character therefore matches the mouth shape images in every stage of the dialogue scene, which yields a more lifelike dialogue display effect and a better experience for the user.
Brief description of the drawings
The above and other objects, features, and advantages of exemplary embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the present invention are shown by way of example and not limitation, in which:
Fig. 1 schematically shows a framework diagram of an exemplary application scenario of embodiments of the present invention;
Fig. 2 schematically shows a flowchart of one embodiment of the image display method coordinated with audio according to the present invention;
Fig. 3 schematically shows a flowchart of another embodiment of the image display method coordinated with audio according to the present invention;
Fig. 4 schematically shows a flowchart of yet another embodiment of the image display method coordinated with audio according to the present invention;
Fig. 5 schematically shows a structural diagram of one embodiment of the image display device coordinated with audio according to the present invention.
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Detailed description of embodiments
The principle and spirit of the present invention are described below with reference to several exemplary embodiments. It should be understood that these embodiments are given only to enable those skilled in the art to better understand and thereby implement the present invention, and not to limit the scope of the invention in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present invention may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, and the like), or a combination of hardware and software.
According to embodiments of the present invention, an image display method and device coordinated with audio are proposed.
As used herein, the term "dialogue scene" denotes a story scene segment of an application that contains dialogue between scene characters. A "dialogue scene" may be implemented by one file or a group of files, and the application may run the "dialogue scene" by calling those files. The application containing the "dialogue scene" may be, for example, a game application or a computer simulation application; the present invention is not limited in this respect. In addition, any number of elements in the drawings is given by way of example and not limitation, and any naming is used only for distinction and carries no limiting meaning.
The principle and spirit of the present invention are explained in detail below with reference to several representative embodiments of the present invention.
Summary of the invention
The inventors found that a scene character in a dialogue scene usually does not speak continuously. In many cases the character's speech contains pauses; that is, even within a dialogue scene, the character is speaking only in some stages and is silent in the others. In the prior art, however, the application switches among the pictures of the character's different mouth shapes for the dialogue scene as a whole, so the mouth shape of the scene character in the displayed image changes continuously. Consequently, even in the time gaps of the dialogue scene in which the character is silent, the mouth shape is still changing, and the dialogue audio of the scene character cannot be matched with the mouth shape images.
Based on this finding, the basic principle of the present invention is as follows. Because the waveform amplitude of the audio in a dialogue scene reflects whether the scene character is speaking, the dialogue scene can be divided according to its audio waveform information. A large waveform amplitude indicates that the character is speaking, so the periods in which the waveform amplitude is relatively large can be taken as speech time periods; when the dialogue scene runs, the mouth shape of the scene character is displayed dynamically in those periods, so that a changing mouth shape image is shown while the character speaks. A small waveform amplitude indicates that the character is not speaking, so the periods in which the waveform amplitude is relatively small can be taken as silent time periods; when the dialogue scene runs, the mouth shape of the scene character is displayed statically in those periods, so that an unchanged mouth shape image is shown while the character is not speaking. The dialogue audio of the scene character therefore matches the mouth shape images in every stage of the dialogue scene, which yields a more lifelike dialogue display effect and a better experience for the user.
Having described the basic principle of the present invention, various non-limiting embodiments of the present invention are introduced below.
Overview of the application scenario
Reference is first made to Fig. 1, which is a framework diagram of an exemplary application scenario of embodiments of the present invention. A user can interact with the client 102 on a user device to run a dialogue scene, and the application that runs the dialogue scene may be provided to the client 102 by the application's server 101. Those skilled in the art will understand that the framework shown in Fig. 1 is only one example in which embodiments of the present invention can be implemented. The scope of application of embodiments of the present invention is not limited by any aspect of this framework.
It should be noted that the user device herein may be any user device, whether existing, under development, or developed in the future, on which the client 102 can interact with the server 101 over any form of wired and/or wireless connection (for example, Wi-Fi, LAN, cellular, coaxial cable, and so on), including but not limited to existing, in-development, or future smartphones, non-smart mobile phones, tablet computers, laptop personal computers, desktop personal computers, minicomputers, midrange computers, mainframe computers, and the like.
It should also be noted that the server 101 herein is only one example of a device, whether existing, under development, or developed in the future, on which the application system can be deployed. Embodiments of the present invention are not limited in this respect.
Based on the framework shown in Fig. 1, the client 102 can run a dialogue scene. When the dialogue scene runs in a speech time period, the client 102 can dynamically display the mouth shape of the scene character. When the dialogue scene runs in a silent time period, the client 102 can statically display the mouth shape of the scene character. The speech time period and the silent time period are obtained by dividing the dialogue scene according to waveform information of the audio corresponding to the dialogue scene, where the waveform amplitude of the waveform information is greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold is not less than the second amplitude threshold.
It can be understood that, in the application scenario of the present invention, although the actions of embodiments of the present invention are described here and below as being executed by the client 102, these actions may also be executed partly by the client 102 and partly by the server 101, or entirely by the server 101. The present invention is not limited as to the executing entity, as long as the actions disclosed in the embodiments of the present invention are executed.
Exemplary method
With reference to the application scenario of Fig. 1, an image display method coordinated with audio according to exemplary embodiments of the present invention is described below with reference to Figs. 2 to 3. It should be noted that the above application scenario is shown only to facilitate understanding of the spirit and principle of the present invention, and embodiments of the present invention are not limited in this respect. Rather, embodiments of the present invention can be applied to any applicable scenario.
Referring to Fig. 2, a flowchart of one embodiment of the image display method coordinated with audio according to the present invention is shown. In this embodiment, the method may, for example, include the following steps:
Step 201: run a dialogue scene.
Step 202: when the dialogue scene runs in a speech time period, dynamically display the mouth shape of the scene character.
Dynamically displaying the mouth shape of the scene character may specifically mean dynamically switching among multiple pictures of different mouth shapes of the scene character, so that the mouth shape of the character in the displayed image is in a dynamically changing state.
Step 203: when the dialogue scene runs in a silent time period, statically display the mouth shape of the scene character.
Statically displaying the mouth shape of the scene character may, for example, mean continuing to display the same picture of the character's mouth shape, so that the mouth shape of the character in the displayed image remains static and unchanged. Alternatively, it may mean switching among multiple pictures that show the same mouth shape of the scene character, so that the mouth shape of the character in the displayed image remains static.
It can be understood that a dialogue scene is composed of speech time periods and silent time periods, which are obtained by dividing the dialogue scene according to the waveform information of the audio corresponding to the dialogue scene. Specifically, the waveform amplitude of the waveform information is greater than the first amplitude threshold in a speech time period and less than the second amplitude threshold in a silent time period. That is, for any moment of the dialogue scene, if the waveform amplitude of the audio at that moment is greater than the first amplitude threshold, the moment belongs to a speech time period of the dialogue scene; if the waveform amplitude of the audio at that moment is less than the second amplitude threshold, the moment belongs to a silent time period of the dialogue scene. The first amplitude threshold is not less than the second amplitude threshold; that is, the two selected thresholds may be the same value, or the first amplitude threshold may be greater than the second amplitude threshold. For example, both the first amplitude threshold and the second amplitude threshold may be set to 0.2 decibels.
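The two-threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_moment` is hypothetical, and the text does not say how a moment whose amplitude lies between the two thresholds is treated, so the sketch returns `None` for that case and leaves the decision to the caller.

```python
from typing import Optional


def classify_moment(amplitude: float,
                    first_threshold: float,
                    second_threshold: float) -> Optional[str]:
    """Classify one moment of the dialogue scene by its waveform amplitude.

    Returns "speech" or "silent", or None when the amplitude lies between
    the two thresholds (a case the description leaves unspecified).
    """
    # The first amplitude threshold must not be less than the second one.
    if first_threshold < second_threshold:
        raise ValueError("first threshold must not be less than the second")
    if amplitude > first_threshold:
        return "speech"   # the moment belongs to a speech time period
    if amplitude < second_threshold:
        return "silent"   # the moment belongs to a silent time period
    return None           # assumption: undecided, caller chooses what to do
```

With both thresholds set to the example value 0.2, an amplitude of 0.5 is classified as speech and an amplitude of 0.1 as silent; a caller could, for instance, keep the previous state when `None` is returned.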
It should be noted that, in this embodiment, the speech time periods and silent time periods of the dialogue scene can be divided in a number of different ways, and, depending on the way the time periods are divided, the speech time periods and silent time periods can be identified in different ways while the dialogue scene is running.
For example, in some implementations of this embodiment, the speech time periods and silent time periods may be identified and divided in real time while the dialogue scene is running. Specifically, while the dialogue scene is running, the current waveform information of the audio of the dialogue scene can be obtained in real time, and whether the current moment belongs to a speech time period or a silent time period is determined from the waveform amplitude of the current waveform information. If the waveform amplitude of the current waveform information is greater than the first amplitude threshold, the current moment is determined to belong to a speech time period, and the mouth shape of the scene character can be displayed dynamically; if the waveform amplitude of the current waveform information is less than the second amplitude threshold, the current moment is determined to belong to a silent time period, and the mouth shape of the scene character can be displayed statically.
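The real-time variant above might look roughly like the following sketch. The generator `realtime_mouth_frames`, the picture names, and the default thresholds are assumptions made for illustration; in particular, the scene is assumed to start silent, and an amplitude between the two thresholds is assumed to keep the previous state.

```python
from typing import Iterable, Iterator, List


def realtime_mouth_frames(amplitudes: Iterable[float],
                          mouth_pictures: List[str],
                          first_threshold: float = 0.2,
                          second_threshold: float = 0.2) -> Iterator[str]:
    """Yield the mouth picture to show for each incoming amplitude reading."""
    index = 0          # picture currently on screen
    speaking = False   # assumption: the scene starts in a silent period
    for amplitude in amplitudes:
        if amplitude > first_threshold:
            speaking = True      # current moment is in a speech time period
        elif amplitude < second_threshold:
            speaking = False     # current moment is in a silent time period
        # Otherwise keep the previous state (assumption, not from the text).
        if speaking:
            # Dynamic display: switch to the next mouth-shape picture.
            index = (index + 1) % len(mouth_pictures)
        # Static display: the same picture stays on screen.
        yield mouth_pictures[index]


frames = list(realtime_mouth_frames([0.0, 0.5, 0.6, 0.1, 0.1, 0.7],
                                    ["a", "b", "c"]))
print(frames)  # ['a', 'b', 'c', 'c', 'c', 'a']
```

The picture only advances while the amplitude indicates speech, so the mouth holds still during the two quiet readings in the middle of the sample input.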
As another example, in other implementations of this embodiment, the speech time periods and silent time periods may be obtained by dividing the dialogue scene in advance, before the dialogue scene runs, and the time periods so obtained may be recorded before the dialogue scene runs, so that they can be identified from the record while the dialogue scene is running. Specifically, the speech time periods and silent time periods of the dialogue scene may, for example, be recorded in advance in time period information configured for the dialogue scene. Step 202 may then be: in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a speech time period, dynamically displaying the mouth shape of the scene character. Step 203 may be: in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a silent time period, statically displaying the mouth shape of the scene character. More specifically, in this implementation, the waveform information of the audio of the entire dialogue scene can be obtained before the dialogue scene runs; the dialogue scene is divided into speech time periods and silent time periods according to how the waveform amplitude at each moment compares with the first amplitude threshold and the second amplitude threshold; and the resulting time periods are recorded as the time period information of the dialogue scene. While the dialogue scene is running, the time period information is consulted to determine in real time whether the current moment belongs to a speech time period or a silent time period: if the time period information indicates a speech time period, step 202 is performed; if it indicates a silent time period, step 203 is performed.
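A minimal sketch of the pre-recorded variant follows. The names `divide_scene` and `period_at`, the representation of the time period information as a sorted list of `(start_time, kind)` pairs, and the fixed sampling interval `tick` are all assumptions for illustration; the description does not prescribe a storage format.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

Segment = Tuple[float, str]  # (start time in seconds, "speech" or "silent")


def divide_scene(amplitudes: Sequence[float], tick: float,
                 first_threshold: float = 0.2,
                 second_threshold: float = 0.2) -> List[Segment]:
    """Divide a whole scene in advance; one amplitude reading per tick."""
    segments: List[Segment] = []
    state = "silent"  # assumption: the scene starts silent
    for i, amplitude in enumerate(amplitudes):
        if amplitude > first_threshold:
            state = "speech"
        elif amplitude < second_threshold:
            state = "silent"
        # Record a new segment only when the state changes.
        if not segments or segments[-1][1] != state:
            segments.append((i * tick, state))
    return segments


def period_at(segments: List[Segment], t: float) -> str:
    """Look up, at run time, which kind of period contains time t."""
    if not segments or t < segments[0][0]:
        raise ValueError("time lies outside the recorded scene")
    starts = [start for start, _ in segments]
    return segments[bisect_right(starts, t) - 1][1]


info = divide_scene([0.0, 0.5, 0.6, 0.1], tick=0.5)
print(info)                  # [(0.0, 'silent'), (0.5, 'speech'), (1.5, 'silent')]
print(period_at(info, 0.7))  # speech
```

Dividing the scene once and looking up the current time at run time keeps the per-frame work to a binary search, at the cost of storing the time period information alongside the scene.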
As yet another example, in still other implementations of this embodiment, the speech time periods and silent time periods may be obtained by dividing the dialogue scene in advance, and a video image file may be generated for the dialogue scene before it runs, according to the time periods so obtained, such that the mouth shape of the scene character in the video image file is displayed dynamically in the speech time periods and statically in the silent time periods. When the dialogue scene runs, the display of the character's mouth shape is then controlled by the video image file. Specifically, both the dynamic display and the static display of the mouth shape of the scene character may be achieved by playing, while the dialogue scene is running, the video image file configured in advance for the dialogue scene. In the images of the video image file that fall within speech time periods, the mouth shape of the scene character changes dynamically; in the images that fall within silent time periods, the mouth shape of the scene character is static and unchanged. More specifically, in this implementation, the waveform information of the audio of the entire dialogue scene can be obtained before the dialogue scene runs; the dialogue scene is divided into speech time periods and silent time periods according to how the waveform amplitude at each moment compares with the first amplitude threshold and the second amplitude threshold; and a video image file is generated for the dialogue scene that displays the character's mouth shape dynamically in the speech time periods and statically in the silent time periods. While the dialogue scene is running, it is then only necessary to play the video image file for the mouth shape of the scene character to be displayed dynamically in the speech time periods and statically in the silent time periods, with no need to identify in real time whether the current moment belongs to a speech time period or a silent time period.
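The pre-generation step can be illustrated with a sketch that builds the per-frame picture sequence from which such a video image file could be rendered. The function `build_frame_plan`, the `(start, end, kind)` segment format, and the frame rate are hypothetical; actually encoding the frames into a video file is outside the scope of this sketch.

```python
from typing import List, Tuple


def build_frame_plan(segments: List[Tuple[float, float, str]],
                     frame_rate: int,
                     mouth_pictures: List[str]) -> List[str]:
    """Return, frame by frame, which mouth picture the video should contain.

    segments holds (start, end, kind) entries in seconds, where kind is
    "speech" or "silent"; the format is an assumption for illustration.
    """
    plan: List[str] = []
    index = 0
    for start, end, kind in segments:
        frame_count = round((end - start) * frame_rate)
        for _ in range(frame_count):
            if kind == "speech":
                # Speech time period: advance to the next mouth shape.
                index = (index + 1) % len(mouth_pictures)
            # Silent time period: repeat the picture already shown.
            plan.append(mouth_pictures[index])
    return plan


plan = build_frame_plan(
    [(0.0, 1.0, "silent"), (1.0, 2.0, "speech"), (2.0, 3.0, "silent")],
    frame_rate=2,
    mouth_pictures=["a", "b", "c"],
)
print(plan)  # ['a', 'a', 'b', 'c', 'c', 'c']
```

Because the plan is fixed before the scene runs, playback needs no amplitude checks at all, which matches the trade-off described above.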
It can be understood that, in the implementations of this embodiment described above, some steps are executed while the dialogue scene is running and some steps are executed in advance, before the dialogue scene runs. The steps executed while the dialogue scene is running may be performed by the application installed on the terminal device that runs the dialogue scene; that is, the application executes these steps while running the dialogue scene. As for the steps executed in advance, in some implementations they may be executed when the application installed on the terminal device is updated; in that case they are performed by the installed application itself, that is, when the application on the terminal device updates itself. In other implementations, the steps executed in advance may be executed while the application is being written, before it is installed on the terminal device; in that case they are performed by the device on which the developer writes the application.
It should be noted that two kinds of brief mismatch can arise in a dialogue scene. On the one hand, a scene character who is speaking may make brief pauses, for example between sentences or between certain phrases or words, and at these pauses the waveform amplitude of the audio may fall below the second amplitude threshold. The character's mouth shape would then be displayed statically for several short stretches while the character is still speaking, so that the dialogue audio and the mouth shape image of the scene character briefly fail to match. On the other hand, brief noise may occur while the scene character is silent, and at this noise the waveform amplitude of the audio may exceed the first amplitude threshold. The mouth shape would then be displayed dynamically for several short stretches while the character is silent, again causing a brief mismatch between the dialogue audio and the mouth shape image. To avoid both kinds of mismatch, some embodiments can, for example, preset a minimum time interval such that no speech period or silent period of the dialogue scene is shorter than it. The minimum time interval can be set, for example, to 0.1 seconds.
In embodiments with a preset minimum time interval, a specific implementation can proceed as follows before the dialogue scene runs. The dialogue scene is first divided into speech periods and silent periods according to how the waveform amplitude of the audio compares with the first and second amplitude thresholds, and each speech period and silent period is then analyzed. For a speech period whose preceding and following periods are both silent periods, if the speech period is shorter than the minimum time interval, it is merged with those two silent periods into a single silent period. For a silent period whose preceding and following periods are both speech periods, if the silent period is shorter than the minimum time interval, it is merged with those two speech periods into a single speech period. In this way, none of the speech periods and silent periods finally obtained is shorter than the minimum time interval.
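The merging rule described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the tuple layout `(start_s, end_s, kind)`, the function name `merge_short_segments`, and the left-to-right merge order are all hypothetical choices that the text does not specify. Only the 0.1 second default comes from the text.

```python
from typing import List, Tuple

# Hypothetical layout: (start_s, end_s, kind), with kind "speech" or "silence".
Segment = Tuple[float, float, str]

MIN_INTERVAL_S = 0.1  # the example minimum time interval given in the text


def merge_short_segments(segments: List[Segment],
                         min_interval: float = MIN_INTERVAL_S) -> List[Segment]:
    """Merge each too-short interior period into its two neighbours.

    Assumes the segments are sorted, contiguous, and alternate in kind.
    The left-to-right merge order is an assumption of this sketch.
    """
    merged = list(segments)
    changed = True
    while changed:
        changed = False
        # Only interior periods have both a preceding and a following period.
        for i in range(1, len(merged) - 1):
            start, end, kind = merged[i]
            prev_seg, next_seg = merged[i - 1], merged[i + 1]
            if (end - start < min_interval
                    and prev_seg[2] == next_seg[2] != kind):
                # Replace the three periods with one period of the neighbours' kind.
                merged[i - 1:i + 2] = [(prev_seg[0], next_seg[1], prev_seg[2])]
                changed = True
                break
    return merged


example = [
    (0.0, 1.0, "speech"),
    (1.0, 1.05, "silence"),  # brief pause inside speech
    (1.05, 2.0, "speech"),
    (2.0, 3.0, "silence"),
    (3.0, 3.04, "speech"),   # brief noise inside silence
    (3.04, 4.0, "silence"),
]
result = merge_short_segments(example)
```

In this example the 0.05 second pause is absorbed into the surrounding speech and the 0.04 second noise burst into the surrounding silence, leaving one speech period followed by one silent period.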
It can be understood that, because no speech period or silent period is shorter than the minimum time interval, a short silent period between speech periods is absorbed into the speech period. A brief pause while the scene character is speaking therefore does not cause its mouth shape to be displayed statically, the mouth shape image stays in dynamic display throughout the speech, and the brief mismatch between the character's dialogue audio and its mouth shape image is avoided. Similarly, a short speech period between silent periods is absorbed into the silent period. Brief noise while the scene character is silent therefore does not cause its mouth shape to be displayed dynamically, the mouth shape image stays in static display throughout the silence, and the brief mismatch is likewise avoided.
It should be noted that, to present a more lifelike dialogue display to the user, and considering that a scene character's mouth shape differs from one syllable to another, some embodiments can further divide the speech periods so that each resulting speech period corresponds to a different syllable uttered by the scene character. While the dialogue scene is running, the mouth shape of the scene character can then be displayed dynamically in each speech period using the specific mouth shape that matches the corresponding syllable. Specifically, taking a scene character that utters two different syllables as an example, the aforementioned step 202 may include: when the dialogue scene runs in a first speech period, dynamically displaying the mouth shape of the scene character using a first mouth shape; and when the dialogue scene runs in a second speech period, dynamically displaying the mouth shape of the scene character using a second mouth shape. The first speech period and the second speech period are obtained by dividing the speech period according to the speech syllables of its corresponding audio: the speech syllable in the first speech period is a first pronunciation syllable, and the speech syllable in the second speech period is a second pronunciation syllable. The first and second pronunciation syllables are different syllables, and the first and second mouth shapes differ in shape.
In embodiments that present different mouth shapes for different syllables, a specific implementation can, for example, configure a corresponding mouth shape image for each pronunciation syllable in advance. When the dialogue scene is divided into speech periods and silent periods, the pronunciation syllables of the scene character can be identified from the waveform information of the audio in each speech period, and the speech period can be further divided by syllable so that each resulting speech period corresponds to a different pronunciation syllable. When the dialogue scene runs in a given speech period, the mouth shape of the scene character can then be displayed dynamically using the mouth shape image of that period's pronunciation syllable.
More specifically, one possible embodiment is as follows. While the dialogue scene is running, the current waveform information of its audio is obtained in real time and used to determine whether the current moment belongs to a silent period or to the speech period of a particular pronunciation syllable. If the current moment belongs to a silent period, the mouth shape of the scene character is displayed statically; if it belongs to the speech period of a pronunciation syllable, the mouth shape image of that syllable is called to display the mouth shape dynamically.
Another possible embodiment is as follows. Before the dialogue scene runs, the waveform information of its audio is obtained in advance, and the dialogue scene is divided, according to the waveform information at each moment, into silent periods and speech periods corresponding to different pronunciation syllables. The silent periods and the speech period of each pronunciation syllable are then recorded as the time period information of the dialogue scene. While the dialogue scene is running, the time period information is called to determine in real time whether the current moment belongs to a silent period or to the speech period of a particular pronunciation syllable; the mouth shape is displayed statically in the former case, and dynamically with that syllable's mouth shape image in the latter.
Yet another possible embodiment is as follows. Before the dialogue scene runs, the waveform information of its audio is obtained in advance, and the dialogue scene is divided, according to the waveform information at each moment, into silent periods and speech periods corresponding to different pronunciation syllables. A video image file is then generated for the dialogue scene from these periods, such that in the video image the mouth shape of the scene character is displayed statically during silent periods and dynamically, using the mouth shape image of the corresponding pronunciation syllable, during each syllable's speech period. When the dialogue scene runs, it is only necessary to play the video image file, with no need to identify in real time which period the current moment belongs to.
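The embodiment that records per-syllable speech periods as time period information and looks them up at run time can be sketched roughly as below. Everything named here is a hypothetical stand-in: the `(start_s, end_s, syllable)` record layout, the `MOUTH_SHAPES` table, the image file names, and the function `mouth_image_at` are assumptions made for illustration. Returning a single image name also simplifies what the text calls dynamic display, which would animate the mouth using that shape.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical time period information: (start_s, end_s, syllable), where
# syllable is None for a silent period. Periods are sorted by start time.
Period = Tuple[float, float, Optional[str]]

# Hypothetical pre-configured mouth shape image for each pronunciation syllable.
MOUTH_SHAPES = {"a": "mouth_open_wide.png", "o": "mouth_round.png"}
CLOSED_MOUTH = "mouth_closed.png"  # used for static display


def mouth_image_at(periods: List[Period], t: float) -> str:
    """Return the mouth shape image to use at playback time t (seconds)."""
    starts = [p[0] for p in periods]
    i = bisect_right(starts, t) - 1  # last period starting at or before t
    if i < 0 or t >= periods[i][1]:
        return CLOSED_MOUTH  # assumption: outside all periods counts as silent
    syllable = periods[i][2]
    if syllable is None:
        return CLOSED_MOUTH
    # Assumption: an unknown syllable falls back to the closed mouth.
    return MOUTH_SHAPES.get(syllable, CLOSED_MOUTH)


info = [(0.0, 0.5, None), (0.5, 0.8, "a"), (0.8, 1.2, "o"), (1.2, 2.0, None)]
```

With this sample data, a lookup at 0.6 seconds selects the image configured for the first syllable, and a lookup at 1.5 seconds falls in a silent period and selects the closed mouth.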
With the technical solution of this embodiment, because a larger audio waveform amplitude indicates that the scene character is speaking and a smaller one indicates that it is not, displaying the mouth shape dynamically in speech periods and statically in silent periods means that an image of the mouth shape changing is shown only while the scene character is speaking, and an image of the mouth shape unchanged is shown while it is silent. The dialogue audio and the mouth shape image of the scene character thus match at every stage of the dialogue scene, giving a more lifelike dialogue display.
To help those skilled in the art understand embodiments of the present invention more clearly in concrete application scenarios, two application scenarios are described below as specific examples.
In application scenario example one, the speech periods and silent periods of the dialogue scene are divided in advance by a first device and recorded in time period information, and when the dialogue scene runs, a second device determines how to display the scene character's mouth shape according to that time period information. The first device can be, for example, the terminal device on which a technician writes the application program, a server device that provides the application program, or the terminal device on which the user has installed the application client. The second device can be, for example, the terminal device on which the user has installed the application client. Specifically, for application scenario example one, refer to the flowchart shown in Fig. 3 of another embodiment of the image display method coordinated with audio in the present invention. This embodiment may include, for example, the following steps:
Step 301: in response to a period division instruction for a dialogue scene in the application program, the first device obtains the audio waveform information of the dialogue scene.
Specifically, after obtaining the audio of the dialogue scene, the first device can obtain the waveform information by parsing the audio.
Step 302: the first device divides the dialogue scene into speech periods and silent periods according to the audio waveform information.
Specifically, for any moment in the dialogue scene, if the waveform amplitude of the waveform information is greater than the first amplitude threshold, the moment is assigned to a speech period; if the waveform amplitude is less than the second amplitude threshold, the moment is assigned to a silent period. In addition, a speech period in which the waveform amplitude is greater than the first amplitude threshold can be divided further according to the pronunciation syllables corresponding to the waveform information, yielding speech periods that each correspond to a different pronunciation syllable. Furthermore, a speech period that is shorter than the minimum time interval and whose preceding and following periods are both silent periods can be merged with those two silent periods into a single silent period; likewise, a silent period that is shorter than the minimum time interval and whose preceding and following periods are both speech periods can be merged with those two speech periods into a single speech period. In this way, none of the speech periods and silent periods finally obtained is shorter than the minimum time interval.
Step 303: the first device records the speech periods and silent periods of the dialogue scene as the time period information of the dialogue scene, and saves the time period information in the application program in correspondence with the dialogue scene.
Step 304: in response to a trigger instruction for running the dialogue scene, the second device calls the time period information of the dialogue scene.
Step 305: while the dialogue scene is running, the second device determines in real time, according to the time period information of the dialogue scene, whether the current moment belongs to a speech period or a silent period.
Step 306: in response to the current moment belonging to a speech period, the second device dynamically displays the mouth shape of the scene character.
Specifically, if the time period information indicates which pronunciation syllable the speech period corresponds to, the second device can dynamically display the scene character's mouth shape using the mouth shape image configured in advance for that pronunciation syllable.
Step 307: in response to the current moment belonging to a silent period, the second device statically displays the mouth shape of the scene character.
With the technical solution of this embodiment, an image of the scene character's mouth shape changing is shown in the dialogue scene only while the scene character is speaking, and an image of the mouth shape unchanged is shown while the scene character is silent. The dialogue audio and the mouth shape image of the scene character thus match at every stage of the dialogue scene, giving a more lifelike dialogue display.
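The flow of steps 301 to 307 can be sketched as follows. This is only an illustration under several assumptions that the text leaves open: amplitudes are given as one value per fixed-length frame, an amplitude lying between the two thresholds keeps the previous state, the scene is treated as silent before the first sample, and the time period information is stored as JSON. The function names `divide_into_periods`, `save_time_period_info`, and `is_speech` are hypothetical.

```python
import json
from typing import List, Sequence, Tuple

Period = Tuple[float, float, str]  # (start_s, end_s, "speech" or "silence")


def divide_into_periods(amplitudes: Sequence[float], frame_s: float,
                        first_threshold: float,
                        second_threshold: float) -> List[Period]:
    """First device, step 302: divide the scene by amplitude thresholds."""
    if first_threshold < second_threshold:
        raise ValueError("first threshold must not be less than the second")
    runs: List[list] = []  # each run is [start_index, end_index, kind]
    state = "silence"  # assumption: the scene starts silent
    for i, amp in enumerate(amplitudes):
        if amp > first_threshold:
            state = "speech"
        elif amp < second_threshold:
            state = "silence"
        # Otherwise the amplitude lies between the thresholds. Keeping the
        # previous state is an assumption; the text does not cover this case.
        if runs and runs[-1][2] == state:
            runs[-1][1] = i + 1  # extend the current run
        else:
            runs.append([i, i + 1, state])
    return [(s * frame_s, e * frame_s, kind) for s, e, kind in runs]


def save_time_period_info(periods: List[Period]) -> str:
    """First device, step 303: record the periods (hypothetical JSON format)."""
    return json.dumps([{"start": s, "end": e, "kind": k} for s, e, k in periods])


def is_speech(info_json: str, t: float) -> bool:
    """Second device, step 305: classify the current moment t (seconds)."""
    for p in json.loads(info_json):
        if p["start"] <= t < p["end"]:
            return p["kind"] == "speech"
    return False  # assumption: moments outside all periods count as silent


amps = [0.0, 0.1, 0.8, 0.4, 0.9, 0.1, 0.0]  # one amplitude per 0.5 s frame
periods = divide_into_periods(amps, 0.5, first_threshold=0.5, second_threshold=0.2)
info = save_time_period_info(periods)
```

Here the fourth sample (0.4) lies between the two thresholds and is kept in the surrounding speech period, so the scene divides into a silent period, one speech period, and a final silent period.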
In application scenario example two, when dividing the dialogue scene into periods in advance, the first device generates, according to the divided periods, a video image file in which the display mode of the scene character's mouth shape is configured for each period. When the dialogue scene runs, the second device only needs to play the video image file and no longer needs to identify speech periods and silent periods. The first device can be, for example, the terminal device on which a technician writes the application program, a server device that provides the application program, or the terminal device on which the user has installed the application client. The second device can be, for example, the terminal device on which the user has installed the application client. Specifically, for application scenario example two, refer to the flowchart shown in Fig. 4 of another embodiment of the image display method coordinated with audio in the present invention. This embodiment may include, for example, the following steps:
Step 401: in response to a period division instruction for a dialogue scene in the application program, the first device obtains the audio waveform information of the dialogue scene.
Specifically, after obtaining the audio of the dialogue scene, the first device can obtain the waveform information by parsing the audio.
Step 402: the first device divides the dialogue scene into speech periods and silent periods according to the audio waveform information.
Specifically, for any moment in the dialogue scene, if the waveform amplitude of the waveform information is greater than the first amplitude threshold, the moment is assigned to a speech period; if the waveform amplitude is less than the second amplitude threshold, the moment is assigned to a silent period. In addition, a speech period in which the waveform amplitude is greater than the first amplitude threshold can be divided further according to the pronunciation syllables corresponding to the waveform information, yielding speech periods that each correspond to a different pronunciation syllable. Furthermore, a speech period that is shorter than the minimum time interval and whose preceding and following periods are both silent periods can be merged with those two silent periods into a single silent period; likewise, a silent period that is shorter than the minimum time interval and whose preceding and following periods are both speech periods can be merged with those two speech periods into a single speech period. In this way, none of the speech periods and silent periods finally obtained is shorter than the minimum time interval.
Step 403: the first device generates a video image file of the dialogue scene according to the speech periods and silent periods of the dialogue scene, and saves the video image file in the application program in correspondence with the dialogue scene.
In the video image file, the mouth shape of the scene character is displayed statically in silent periods and dynamically in speech periods. Further, if the speech periods have been divided into speech periods that each correspond to a different pronunciation syllable, then in the video image file the scene character's mouth shape is displayed dynamically in the speech period of each pronunciation syllable using the mouth shape image configured for that syllable.
Step 404: in response to a trigger instruction for running the dialogue scene, the second device calls the video image file of the dialogue scene and plays it.
With the technical solution of this embodiment, an image of the scene character's mouth shape changing is shown in the dialogue scene only while the scene character is speaking, and an image of the mouth shape unchanged is shown while the scene character is silent. The dialogue audio and the mouth shape image of the scene character thus match at every stage of the dialogue scene, giving a more lifelike dialogue display.
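The pre-generation in step 403 can be pictured as choosing one mouth image per video frame ahead of time. The sketch below stops at that frame plan and does not encode an actual video file. The function name `build_frame_plan`, the frame rate, the image names, and the choice to cycle through a short list of talking frames are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Period = Tuple[float, float, str]  # (start_s, end_s, "speech" or "silence")


def build_frame_plan(periods: Sequence[Period], duration_s: float, fps: int,
                     talk_frames: Sequence[str], rest_frame: str) -> List[str]:
    """Choose the mouth image for every frame of the pre-generated video."""
    plan: List[str] = []
    talk_index = 0
    for n in range(int(duration_s * fps)):
        t = n / fps  # timestamp of frame n in seconds
        speaking = any(start <= t < end and kind == "speech"
                       for start, end, kind in periods)
        if speaking:
            # Dynamic display: cycle through the talking frames (an assumption).
            plan.append(talk_frames[talk_index % len(talk_frames)])
            talk_index += 1
        else:
            # Static display: hold the resting mouth and restart the cycle.
            plan.append(rest_frame)
            talk_index = 0
    return plan


scene = [(0.0, 0.5, "silence"), (0.5, 1.0, "speech"), (1.0, 1.5, "silence")]
plan = build_frame_plan(scene, duration_s=1.5, fps=4,
                        talk_frames=["open", "half"], rest_frame="closed")
```

Because the plan is computed before the scene runs, the playing device needs no period lookup at run time, which is the trade-off this scenario makes against scenario one.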
Exemplary device
Having described the method of the exemplary embodiments of the present invention, an image display device coordinated with audio according to an exemplary embodiment of the present invention is described next with reference to Fig. 5.
Referring to Fig. 5, a structural diagram of one embodiment of the image display device coordinated with audio in the present invention is shown. In this embodiment, the device may include, for example:
a running module 501, configured to run a dialogue scene;
a dynamic display module 502, configured to dynamically display the mouth shape of a scene character when the dialogue scene runs in a speech period;
a static display module 503, configured to statically display the mouth shape of the scene character when the dialogue scene runs in a silent period;
wherein the speech period and the silent period are obtained by dividing the dialogue scene according to the waveform information of the audio corresponding to the dialogue scene, the waveform amplitude of the waveform information being greater than a first amplitude threshold in the speech period and less than a second amplitude threshold in the silent period, and the first amplitude threshold being not less than the second amplitude threshold.
Optionally, in some embodiments, the speech periods and silent periods of the dialogue scene can, for example, be recorded in advance in time period information configured for the dialogue scene;
the dynamic display module 502 is specifically configured to dynamically display the mouth shape of the scene character in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a speech period;
the static display module 503 is specifically configured to statically display the mouth shape of the scene character in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a silent period.
Optionally, in other embodiments, both the dynamic display and the static display of the scene character's mouth shape can, for example, be realized by playing, while the dialogue scene is running, a video image file configured in advance for the dialogue scene. In the images of the video image file within speech periods, the mouth shape of the scene character changes dynamically; in the images within silent periods, the mouth shape of the scene character remains static.
Optionally, in still other embodiments, the dynamic display module 502 may include, for example:
a first dynamic display submodule, configured to dynamically display the mouth shape of the scene character using a first mouth shape when the dialogue scene runs in a first speech period;
a second dynamic display submodule, configured to dynamically display the mouth shape of the scene character using a second mouth shape when the dialogue scene runs in a second speech period;
wherein the first speech period and the second speech period are obtained by dividing the speech period according to the speech syllables of the audio of the speech period, the speech syllable in the first speech period being a first pronunciation syllable and the speech syllable in the second speech period being a second pronunciation syllable;
wherein the first mouth shape and the second mouth shape differ in shape.
Optionally, in some other embodiments, none of the speech periods and silent periods of the dialogue scene is, for example, shorter than a preset minimum time interval.
With the technical solution of this embodiment, because a larger audio waveform amplitude indicates that the scene character is speaking and a smaller one indicates that it is not, displaying the mouth shape dynamically in speech periods and statically in silent periods means that an image of the mouth shape changing is shown only while the scene character is speaking, and an image of the mouth shape unchanged is shown while it is silent. The dialogue audio and the mouth shape image of the scene character thus match at every stage of the dialogue scene, giving a more lifelike dialogue display.
It should be noted that although several modules or submodules of the image display device coordinated with audio are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules described above can be embodied in a single module. Conversely, the features and functions of one module described above can be further divided and embodied by multiple modules.
In addition, although the operations of the method of the present invention are described in the drawings in a particular order, this does not require or imply that the operations must be performed in that particular order, or that all of the operations shown must be performed to achieve the desired result. Additionally or alternatively, certain steps can be omitted, multiple steps can be combined into one step, and/or one step can be broken down into multiple steps.
Although the spirit and principles of the present invention have been described with reference to several specific embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed, and that the division into aspects, which is made only for convenience of presentation, does not mean that features in those aspects cannot be combined to advantage. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (4)
1. a kind of method for displaying image of cooperation audio, including:
In response to running the triggering command of session operational scenarios, session operational scenarios are run;
According to the correspondence of session operational scenarios and time segment information, the time segment information of session operational scenarios is called;
According to the time segment information of session operational scenarios, determine that current time belongs to Speech time section or time periods of silence in real time;
The Speech time section and the time periods of silence are recorded in advance as in the time segment information of session operational scenarios configuration;
When the session operational scenarios operate in Speech time section, Dynamic Announce is carried out to the shape of the mouth as one speaks of scene role, is specifically included:It rings
Ying Yu determines presently described session operational scenarios and operates in during running the session operational scenarios according to the time segment information
Speech time section carries out Dynamic Announce to the shape of the mouth as one speaks of scene role;
Wherein, dynamically displaying the mouth shape of the scene character when the dialogue scene runs in a speech time period includes:
When the dialogue scene runs in a first speech time period, dynamically displaying the mouth shape of the scene character using a first mouth shape;
When the dialogue scene runs in a second speech time period, dynamically displaying the mouth shape of the scene character using a second mouth shape;
Wherein, the first speech time period and the second speech time period are obtained by dividing the speech time period according to the speech syllables of the audio corresponding to the speech time period, the speech syllable in the first speech time period being a first pronunciation syllable and the speech syllable in the second speech time period being a second pronunciation syllable;
Wherein, the first mouth shape and the second mouth shape differ in shape;
Statically displaying the mouth shape of the scene character when the dialogue scene runs in a silent time period specifically includes: in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a silent time period, statically displaying the mouth shape of the scene character;
Wherein, the speech time period and the silent time period are obtained by dividing the dialogue scene according to the waveform information of the audio corresponding to the dialogue scene, the waveform amplitude of the waveform information being greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold being not less than the second amplitude threshold.
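The two-threshold division described in this claim can be illustrated with a minimal Python sketch. It is not taken from the patent: the function name `split_by_amplitude`, the per-frame amplitude list, and the rule that a frame lying between the two thresholds keeps the label of the preceding frame are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# (label, start_index, end_index_exclusive); indices refer to amplitude frames.
Segment = Tuple[str, int, int]


def split_by_amplitude(
    amplitudes: List[float],
    speech_threshold: float,
    silence_threshold: float,
) -> List[Segment]:
    """Label runs of frames as "speech" or "silence" using two thresholds."""
    # The claim requires the first threshold to be not less than the second.
    if speech_threshold < silence_threshold:
        raise ValueError("speech_threshold must be >= silence_threshold")

    segments: List[Segment] = []
    state: Optional[str] = None
    start = 0
    for i, amplitude in enumerate(amplitudes):
        if amplitude > speech_threshold:
            new_state = "speech"
        elif amplitude < silence_threshold:
            new_state = "silence"
        else:
            # Assumption: frames between the thresholds keep the previous
            # label (treated as silence if there is no previous frame).
            new_state = state if state is not None else "silence"
        if new_state != state:
            if state is not None:
                segments.append((state, start, i))
            state = new_state
            start = i
    if state is not None:
        segments.append((state, start, len(amplitudes)))
    return segments


frames = [0.0, 0.1, 0.9, 0.8, 0.4, 0.1, 0.0]
result = split_by_amplitude(frames, speech_threshold=0.5, silence_threshold=0.2)
print(result)
```

With the sample frames, the frame at amplitude 0.4 falls between the thresholds and stays in the speech period, which is one way the gap between the two thresholds can suppress rapid label flipping.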
2. The method according to claim 1, wherein the speech time periods and the silent time periods of the dialogue scene are each not less than a preset minimum time interval.
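One possible way to enforce such a minimum interval is to merge any too-short period into the period before it. The Python sketch below is only an illustration under that assumption; the claim does not say how the minimum is enforced, and the name `enforce_min_duration` and the millisecond units are hypothetical.

```python
from typing import List, Tuple

# (label, start_ms, end_ms)
Period = Tuple[str, int, int]


def enforce_min_duration(periods: List[Period], min_ms: int) -> List[Period]:
    """Absorb periods shorter than min_ms into the preceding period."""
    merged: List[Period] = []
    for label, start, end in periods:
        if merged and (end - start) < min_ms:
            # Too short: extend the previous period over this one.
            prev_label, prev_start, _ = merged[-1]
            merged[-1] = (prev_label, prev_start, end)
        elif merged and merged[-1][0] == label:
            # Same label as the previous period: join them.
            prev_label, prev_start, _ = merged[-1]
            merged[-1] = (prev_label, prev_start, end)
        else:
            merged.append((label, start, end))
    return merged


raw = [
    ("silence", 0, 1000),
    ("speech", 1000, 1050),   # 50 ms blip, shorter than the minimum
    ("silence", 1050, 2000),
    ("speech", 2000, 3000),
]
cleaned = enforce_min_duration(raw, min_ms=200)
print(cleaned)
```

In this sketch a short period at the very start of the list has nothing before it to merge into and is kept unchanged; a fuller implementation would need a rule for that case.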
3. An image display device matched with audio, including:
A running module, for running a dialogue scene;
A calling module, for calling the time period information of the dialogue scene;
A determining module, for determining in real time, according to the time period information of the dialogue scene, whether the current time belongs to a speech time period or a silent time period; the speech time period and the silent time period being recorded in advance in the time period information configured for the dialogue scene;
A dynamic display module, for dynamically displaying the mouth shape of the scene character when the dialogue scene runs in a speech time period, and specifically for dynamically displaying the mouth shape of the scene character in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a speech time period;
Wherein, the dynamic display module includes:
A first dynamic display submodule, for dynamically displaying the mouth shape of the scene character using a first mouth shape when the dialogue scene runs in a first speech time period;
A second dynamic display submodule, for dynamically displaying the mouth shape of the scene character using a second mouth shape when the dialogue scene runs in a second speech time period;
Wherein, the first speech time period and the second speech time period are obtained by dividing the speech time period according to the speech syllables of the audio of the speech time period, the speech syllable in the first speech time period being a first pronunciation syllable and the speech syllable in the second speech time period being a second pronunciation syllable;
Wherein, the first mouth shape and the second mouth shape differ in shape;
A static display module, for statically displaying the mouth shape of the scene character when the dialogue scene runs in a silent time period, and specifically for statically displaying the mouth shape of the scene character in response to determining, according to the time period information while the dialogue scene is running, that the dialogue scene is currently running in a silent time period;
Wherein, the speech time period and the silent time period are obtained by dividing the dialogue scene according to the waveform information of the audio corresponding to the dialogue scene, the waveform amplitude of the waveform information being greater than a first amplitude threshold in the speech time period and less than a second amplitude threshold in the silent time period, and the first amplitude threshold being not less than the second amplitude threshold.
4. The device according to claim 3, wherein the speech time periods and the silent time periods of the dialogue scene are each not less than a preset minimum time interval.
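The runtime behaviour of the determining and display modules in the device claims can be sketched as a lookup into pre-recorded time period information. The Python below is a hedged illustration only: the `Period` record, the mouth shape names, the `"static:closed"` placeholder, and the use of binary search are assumptions and do not appear in the patent.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Period:
    start: float                       # seconds from the start of the scene
    end: float                         # exclusive end time in seconds
    kind: str                          # "speech" or "silence"
    mouth_shape: Optional[str] = None  # hypothetical shape name per syllable


def find_period(periods: List[Period], t: float) -> Optional[Period]:
    """Return the period containing time t; periods must be sorted by start."""
    starts = [p.start for p in periods]
    i = bisect_right(starts, t) - 1
    if i >= 0 and periods[i].start <= t < periods[i].end:
        return periods[i]
    return None


def mouth_display(periods: List[Period], t: float) -> str:
    """Choose static display for silence, dynamic display for speech."""
    period = find_period(periods, t)
    if period is None or period.kind == "silence":
        return "static:closed"
    return f"dynamic:{period.mouth_shape}"


# Time period information recorded in advance for one dialogue scene.
scene = [
    Period(0.0, 1.0, "silence"),
    Period(1.0, 1.4, "speech", "open_wide"),  # first pronunciation syllable
    Period(1.4, 2.0, "speech", "rounded"),    # second pronunciation syllable
    Period(2.0, 3.0, "silence"),
]
for t in (0.5, 1.2, 1.7, 2.5):
    print(t, mouth_display(scene, t))
```

Because the periods are recorded in advance, the per-frame work at run time is a lookup rather than audio analysis, which is consistent with the division of labour the claims describe.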
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510279742.7A CN104869326B (en) | 2015-05-27 | 2015-05-27 | A kind of method for displaying image and equipment of cooperation audio |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104869326A (en) | 2015-08-26 |
CN104869326B (en) | 2018-09-11 |
Family ID=53914807
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510279742.7A Active CN104869326B (en) | 2015-05-27 | 2015-05-27 | A kind of method for displaying image and equipment of cooperation audio |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104869326B (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109168067B (en) * | 2018-11-02 | 2022-04-22 | 深圳Tcl新技术有限公司 | Video time sequence correction method, correction terminal and computer readable storage medium |
CN109600628A (en) * | 2018-12-21 | 2019-04-09 | 广州酷狗计算机科技有限公司 | Video creating method, device, computer equipment and storage medium |
CN113421543B (en) * | 2021-06-30 | 2024-05-24 | 深圳追一科技有限公司 | Data labeling method, device, equipment and readable storage medium |
CN113660537A (en) * | 2021-09-28 | 2021-11-16 | 北京七维视觉科技有限公司 | Subtitle generating method and device |
CN117714763A (en) * | 2024-02-05 | 2024-03-15 | 深圳市鸿普森科技股份有限公司 | Virtual object speech video generation method, device, electronic equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1731833A (en) * | 2005-08-23 | 2006-02-08 | 孙丹 | A method for synthesizing audio-visual files with voice-driven head images |
CN101482976A (en) * | 2009-01-19 | 2009-07-15 | 腾讯科技(深圳)有限公司 | Method for driving change of lip shape by voice, method and apparatus for acquiring lip cartoon |
CN101751692A (en) * | 2009-12-24 | 2010-06-23 | 四川大学 | Method for voice-driven lip animation |
CN104144280A (en) * | 2013-05-08 | 2014-11-12 | 上海恺达广告有限公司 | Voice and action animation synchronous control and device of electronic greeting card |
CN104574478A (en) * | 2014-12-30 | 2015-04-29 | 北京像素软件科技股份有限公司 | Method and device for editing mouth shapes of animation figures |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4568536B2 (en) * | 2004-03-17 | 2010-10-27 | ソニー株式会社 | Measuring device, measuring method, program |
KR20140114238A (en) * | 2013-03-18 | 2014-09-26 | 삼성전자주식회사 | Method for generating and displaying image coupled audio |
Also Published As
Publication number | Publication date |
---|---|
CN104869326A (en) | 2015-08-26 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN104869326B (en) | A kind of method for displaying image and equipment of cooperation audio | |
US11183187B2 (en) | Dialog method, dialog system, dialog apparatus and program that gives impression that dialog system understands content of dialog | |
CN107423364B (en) | Method, device and storage medium for answering operation broadcasting based on artificial intelligence | |
KR102116309B1 (en) | Synchronization animation output system of virtual characters and text | |
CN107003825A (en) | Systems and methods for controlling cinematic direction and dynamic characters through natural language output | |
KR20190109651A (en) | Voice imitation conversation service providing method and sytem based on artificial intelligence | |
WO2023209632A1 (en) | Voice attribute conversion using speech to speech | |
CN113282791A (en) | Video generation method and device | |
US20230206939A1 (en) | System and Method for Talking Avatar | |
US10825357B2 (en) | Systems and methods for variably paced real time translation between the written and spoken forms of a word | |
Yamamoto et al. | Voice interaction system with 3D-CG virtual agent for stand-alone smartphones | |
US11605390B2 (en) | Systems, methods, and apparatus for language acquisition using socio-neuorocognitive techniques | |
US11581006B2 (en) | Systems and methods for variably paced real-time translation between the written and spoken forms of a word | |
Marino et al. | Conversing using WhatsHap: A phoneme based vibrotactile messaging platform | |
Beňuš et al. | Word guessing game with a social robotic head | |
JP6755509B2 (en) | Dialogue method, dialogue system, dialogue scenario generation method, dialogue scenario generator, and program | |
CN110431622A (en) | Speech dialog method and voice dialogue device | |
KR20100040045A (en) | Method of conversation process and method of interlocutor appointment for foreign language study by communication network | |
Neto et al. | Design of a multimodal input interface for a dialogue system | |
Cosi et al. | An Italian event-based ASR-TTS system for the Nao robot | |
Favaro et al. | Learning in Expressive TTS Synthesis | |
Lee et al. | Behavior-SD: Behaviorally Aware Spoken Dialogue Generation with Large Language Models | |
KR20250065958A (en) | Method of constructing learning dataset for speech synthesis with fusion of language, speaker, and emotion within a utterance | |
Gustafson et al. | Eliciting interactional phenomena in human-human dialogues | |
Huckvale | Recording Caregiver Interactions for Machine Acquisition of Spoken Language Using the KLAIR Virtual Infant. |
Legal Events
Code | Title | Date | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
EXSB | Decision made by sipo to initiate substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||