
EP1944755B1 - Modification of a voice signal - Google Patents

Modification of a voice signal

Info

Publication number
EP1944755B1
EP1944755B1 EP08100453A EP08100453A EP1944755B1 EP 1944755 B1 EP1944755 B1 EP 1944755B1 EP 08100453 A EP08100453 A EP 08100453A EP 08100453 A EP08100453 A EP 08100453A EP 1944755 B1 EP1944755 B1 EP 1944755B1
Authority
EP
European Patent Office
Prior art keywords
residual
modified
temporal envelope
modification
residue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Not-in-force
Application number
EP08100453A
Other languages
German (de)
French (fr)
Other versions
EP1944755A1 (en)
Inventor
Olivier Rosec
Damien Vincent
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Orange SA
Original Assignee
France Telecom SA
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by France Telecom SA
Publication of EP1944755A1
Application granted
Publication of EP1944755B1
Legal status: Not-in-force
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/08 Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/06 Elementary speech units used in speech synthesisers; Concatenation rules
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/04 Time compression or expansion
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G10L2021/0135 Voice conversion or morphing

Definitions

  • the present invention relates to the modification of speech and more particularly to the modification of the acoustic parameters of speech signals decomposed into a parametric part and a non-parametric part.
  • It is known to decompose speech signals according to so-called "filter-excitation" models.
  • speech is considered to be a glottal excitation transformed by a filter representing the vocal tract.
  • the excitation is obtained by inverse filtering of the speech signal. It sometimes comprises a part that is itself parametric, together with a residue. The residue corresponds to the difference between the excitation and the corresponding parametric model.
  • the frequency, rhythm or timbre information is changed through the parameters of the model.
  • Another approach is to have a model of the glottal source compact enough that the shape of the glottal signal can be controlled when the signal is modified.
  • Such an approach is described, for example, in the document "Toward a high-quality singing synthesizer with vocal texture control", Stanford University, 2002, by H. L. Lu. Nevertheless, such a model does not capture all the information in the glottal signal. Residual information must be kept, and modifying it raises the problem of lack of temporal coherence mentioned above.
  • One of the objectives of the present invention is to allow such a modification.
  • the temporal coherence of the modified signal is improved.
  • said signal decomposition is a decomposition according to an excitation-filter type model.
  • Such a decomposition makes it possible to obtain a residue corresponding to a glottal excitation.
  • the estimation of the temporal envelope of the residue comprises the estimation of a first envelope, then a temporal smoothing of this first envelope.
  • This embodiment makes it possible to obtain a better estimate of the temporal envelope.
  • the method further comprises a temporal normalization of the residue based on the estimate of the temporal envelope. This makes it possible to obtain an expression of the residue that is substantially independent of its temporal characteristics.
  • the temporal normalization of the residue comprises dividing the residue by the estimate of the temporal envelope.
  • determining a new time envelope for the residue includes changing parameters of the time envelope of the residue according to said modification instructions and applying the modified time envelope to the normalized residual.
  • the estimation of the temporal envelope and the determination of a new temporal envelope are combined.
  • the modification of acoustic characteristics comprises a modification of fundamental frequency information and duration of the parametric part and the residue.
  • the invention also relates to a program for implementing the method described above as defined in claim 9 and a corresponding device as defined in claim 10.
  • the method shown with reference to Figure 1 starts with a step 10 of analysis of the speech signal, which comprises a decomposition 12 according to an excitation-filter model, that is to say a decomposition of the speech signal into a parametric part and a non-parametric part, called the residue, which corresponds to a part of the glottal excitation.
  • A common practice for implementing step 12 is the use of linear prediction techniques such as those described in J. Makhoul, "Linear Prediction: A Tutorial Review", Proceedings of the IEEE, vol. 63 (4), pp. 561-580, April 1975.
  • the terms a_k denote the coefficients of an AR-type filter modeling the vocal tract, and the term e(n) is the residual signal relating to the excitation part, with n a signal frame index. Note that if the order of the model is large enough, then e(n) is not correlated with s(n).
  • the estimation of the coefficients amounts to the inversion of a Toeplitz matrix, which can be achieved using conventional procedures, in particular the algorithm described by J. Durbin, "The fitting of time-series models", Rev. Inst. Int. Statistics.
  • the decomposition 12 makes it possible to obtain, for excitation, a parametric model in addition to the residue.
  • the excitation-filter decomposition is carried out using a priori information on the excitation.
  • the excitation can be modeled by integrating information related to the speech production process, in particular via a parametric model of the derivative of the glottal flow wave (DODG), such as the LF model proposed by Liljencrants and Fant in "A four-parameter model of glottal flow", STL-QPSR, vol. 4, pp. 1-13, 1985.
  • This model is entirely defined by the fundamental period T0, by three shape parameters, namely an open quotient, an asymmetry coefficient and a return phase coefficient, by a position parameter corresponding to the glottal closure instant, and by a term b0 characterizing the amplitude of the DODG.
  • the method delivers a modeling of the speech signal s (n) in the form of a parametric part and a residue that is non-parametric.
  • the analysis step 10 then comprises an estimate 14 of the temporal envelope of the residue.
  • the temporal envelope is defined as the modulus of the analytic signal and is obtained by means of a Hilbert transform.
  • the estimate 14 comprises a smoothing of the temporal envelope of the residue.
  • This provides a better estimate, especially for voiced sounds, for which the envelope is periodic with period T0, T0 denoting the inverse of the fundamental frequency f0.
  • the exponent T represents the transposition operator.
  • $\mathbf{c} = (\mathbf{M}^{H}\mathbf{M})^{-1}\mathbf{M}^{H}\mathbf{d}$
  • H denotes the Hermitian transposition operator.
  • temporal normalization means obtaining a substantially invariant residue at the time level, more precisely, obtaining a residue whose temporal envelope is constant.
  • the method comprises a step 18 for determining instructions for modifying the speech signal.
  • These instructions can be of two types.
  • a target has been defined for each of the parameters to be modified. This is particularly the case in speech synthesis where many algorithms for predicting the duration, the fundamental frequency or even the energy of the signals exist. For example, fundamental frequency and energy values can be estimated for the beginning and the end of each syllable or each phoneme of the utterance. Similarly, the duration of each syllable or phoneme can be predicted. Given these digital targets and the speech signal, modifying coefficients can be obtained by relating the measurement made on the signal to the value of the corresponding predicted target.
  • such targets are not available, but it is possible to define a set of modification coefficients for modifying the desired parameters.
  • a fundamental frequency modification coefficient of 0.5 halves the perceived pitch of the voice.
  • these modification coefficients can be defined globally over the entire utterance or more locally, for example at the scale of a syllable or a word.
  • the method then comprises a step 20 of modifying the speech signal s (n) according to the previously determined instructions.
  • the modification step 20 firstly comprises a modification 22 of the parametric part of the model corresponding to the speech signal and of the normalized residue.
  • this modification relates to the fundamental frequency as well as the duration and is implemented in a conventional manner with a technique known as TD-PSOLA (Time Domain Pitch Synchronous Overlap and Add), as described in the publication "Non-parametric techniques for pitch-scale and time-scale modification of speech", Speech Communication, vol. 16, pp. 175-205, 1995, by E. Moulines and J. Laroche.
  • This technique makes it possible to jointly modify the duration and the fundamental frequency, each with its own time-varying coefficient.
  • Figure 2A represents the speech signal s(n) to be modified.
  • this signal is segmented into so-called pitch-synchronous frames, that is to say that each segment has a duration corresponding to the inverse of the fundamental frequency of the signal.
  • the glottal closure instants also called analysis instants, are located in the vicinity of the energy maxima of the speech signal and the TD-PSOLA treatment allows a good preservation of the characteristics of the speech signal in the vicinity of the extremities.
  • segments obtained by pitch-synchronous analysis.
  • Such pitch-synchronous segmentation is obtained, for example, by time delay techniques or from the method proposed by D. Vincent, O. Rosec, and T. Chonavel, in the publication "Glottal closure instant estimation using an appropriateness measure of the source and continuity constraints", IEEE ICASSP'06, vol. 1, pp. 381-384, Toulouse, France, May 2006 .
  • this pitch-synchronous marking step is performed offline, that is to say not in real time, which reduces the calculation load for implementation in real time.
  • the signal comprises an integer number of segments or frames, each of a duration corresponding to a period which is the inverse of the modified fundamental frequency, as shown in Figure 2B.
  • the modification processing then comprises a windowing 26 of the signal around the analysis instants, that is to say the instants separating the segments. During this windowing, for each analysis instant, a portion of the windowed signal around that instant is selected. This portion of the signal is called a "short-term signal" and extends, in the example, over a duration corresponding to twice the modified fundamental period, as shown in Figure 2C.
  • the modification processing finally comprises a summation 28 of the short-term signals, which are recentered on the synthesis instants and added, as shown in Figure 2D.
  • step 22 is performed using an HNM (Harmonic plus Noise Model) type technique or a phase vocoder type technique.
  • the fundamental frequency and duration changes can also be made by different techniques.
  • the modified normalized residue that is to say the normalized residue whose fundamental frequency and / or time information has been modified, is denoted by ⁇ modif ( n ).
  • the method then comprises a step 30 of modifying the temporal envelope of the residue. More precisely, this step makes it possible to substitute the original temporal characteristics of the residue with temporal characteristics in accordance with the desired modifications.
  • Step 30 begins with a determination 32 of new temporal characteristics of the residue.
  • In the example, this determination consists of the modification of the temporal envelope of the residue, as obtained at the end of step 14.
  • the modification of the fundamental frequency consists of a modification of the temporal envelope to make it coherent with the normalized residual whose fundamental frequency has been modified beforehand.
  • One embodiment of such a modification consists of a dilation / contraction of the original time envelope d (n) in order to preserve the general shape.
  • the shape of the temporal envelope must be changed. For example, when the open quotient is modified, different dilation / contraction factors should be applied to the open and closed portions of the glottal cycle, respectively.
  • Step 30 then comprises a determination of the new residue.
  • this new residue is obtained by multiplying the residue ⁇ modif (n) by the modified envelope d modif .
  • the original residue was therefore normalized, modified and then combined with the new temporal envelope. This makes it possible to ensure the coherence of its temporal envelope with the modifications of fundamental frequency and / or of vocal quality.
  • the excitation coincides with the residue, which corresponds to the case where the residue is obtained by simple inverse linear filtering and where the excitation comprises no parametric part.
  • the excitation is composed of a glottal source that can be modeled by a parametric model and of a residue.
  • the method finally comprises a step 40 of synthesis of the modified signal.
  • This synthesis consists of a filtering of the signal obtained at the end of step 20 by the filter of the vocal tract as defined in step 12.
  • Step 40 also comprises an overlap-add of the frames thus filtered.
  • This synthesis step is conventional and will not be described in more detail here.
  • the specific treatment of the temporal envelope of the residue makes it possible to obtain a modification ensuring a good temporal coherence.
  • the residue can be broken down into subbands.
  • steps 14, 16 and 20 are performed on all or part of the sub-bands considered separately.
  • the final residue obtained is then the sum of the modified residues resulting from the different subbands.
  • the residue can be decomposed into a deterministic part and a stochastic part.
  • steps 14, 16 and 20 are performed for each of the parts considered.
  • the final residue obtained is then the sum of the modified deterministic and stochastic components.
  • the different steps of the invention can be performed in a different order.
  • the time envelope is changed before changes are made to the signal.
  • the modifications are made on the residue with its new time envelope and not on the normalized residue as in the example described above.
  • the steps of normalizing the residue and determining new temporal characteristics are combined.
  • the residue is directly modified by a time factor determined from its time envelope and modification instructions. This temporal factor makes it possible both to suppress the dependence of the residue with its original temporal characteristics and to apply new temporal characteristics.
  • the invention may be implemented by a program containing specific instructions which, when executed by a computer, cause the steps described above to be carried out.
  • the invention can also be implemented by a device comprising appropriate means, such as microprocessors, microcomputers and associated memories, or programmed electronic components.
  • Such a device can be adapted to implement any embodiment of the method described above.
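
The envelope handling listed above (estimate 14 of the temporal envelope, temporal normalization of the residue, and step 30 where a dilated or contracted envelope is applied to the normalized residue) can be illustrated with the following minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: all function names are hypothetical, the smoothing is a plain moving average rather than whatever smoothing the patent uses, the dilation / contraction is done by linear interpolation, and the analytic signal is computed with a naive DFT only to keep the example dependency-free (in practice a routine such as `scipy.signal.hilbert` would normally be used).

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform (kept dependency-free)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [
        sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    return [v / n for v in out] if inverse else out


def temporal_envelope(e):
    """Modulus of the analytic signal of e (Hilbert-transform envelope)."""
    n = len(e)
    spec = dft(e)
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies and zero the negative ones.
    gain = [0.0] * n
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        for k in range(1, n // 2):
            gain[k] = 2.0
    else:
        for k in range(1, (n + 1) // 2):
            gain[k] = 2.0
    analytic = dft([g * v for g, v in zip(gain, spec)], inverse=True)
    return [abs(v) for v in analytic]


def smooth(d, width):
    """Moving-average smoothing of a first envelope estimate (assumed method)."""
    half = width // 2
    out = []
    for i in range(len(d)):
        lo, hi = max(0, i - half), min(len(d), i + half + 1)
        out.append(sum(d[lo:hi]) / (hi - lo))
    return out


def normalize(e, d, floor=1e-8):
    """Divide the residue by its envelope; floor avoids division by zero."""
    return [x / max(v, floor) for x, v in zip(e, d)]


def resample_envelope(d, new_len):
    """Dilate or contract the envelope d (len >= 2) to new_len samples."""
    if new_len == 1:
        return [d[0]]
    out = []
    for i in range(new_len):
        pos = i * (len(d) - 1) / (new_len - 1)
        j = min(int(pos), len(d) - 2)
        frac = pos - j
        out.append((1 - frac) * d[j] + frac * d[j + 1])
    return out


def apply_envelope(e_norm, d_new):
    """Multiply the normalized residue by the new envelope."""
    return [x * v for x, v in zip(e_norm, d_new)]


# Toy residue: a tone whose amplitude varies slowly over the frame.
N = 64
residue = [(1 + 0.5 * math.cos(2 * math.pi * n / N)) * math.cos(2 * math.pi * 8 * n / N)
           for n in range(N)]
d = smooth(temporal_envelope(residue), 3)
e_norm = normalize(residue, d)
# Stand-in for the pitch/duration-modified normalized residue: a shorter frame.
e_norm_modif = e_norm[:48]
d_modif = resample_envelope(d, len(e_norm_modif))
new_residue = apply_envelope(e_norm_modif, d_modif)
```

Truncating the normalized residue is only a placeholder for the actual pitch and duration modification; the point of the sketch is that the envelope is rescaled to the length of the modified residue before being re-applied.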

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Oscillators With Electromechanical Resonators (AREA)
  • Electrophonic Musical Instruments (AREA)
  • Stereophonic System (AREA)

Abstract

The method involves decomposing a voice signal into a parametric part and a non-parametric residue according to an excitation-filter model, and estimating a temporal envelope of the residue. Acoustic characteristics of the parametric part and the residue are modified according to modification instructions. A new temporal envelope is determined for the modified residue according to the modification instructions. A modified voice signal is synthesized from the modified parametric part and from the modified residue with its new temporal envelope. Independent claims are also included for the following: (1) a program comprising instructions to perform a method for modifying acoustic characteristics of a voice signal; (2) a device for modifying acoustic characteristics of a voice signal.

Description

The present invention relates to the modification of speech and, more particularly, to the modification of the acoustic parameters of speech signals decomposed into a parametric part and a non-parametric part.

It is known to decompose speech signals according to so-called "filter-excitation" models. In these models, speech is considered to be a glottal excitation transformed by a filter representing the vocal tract.

The excitation is obtained by inverse filtering of the speech signal. It sometimes comprises a part that is itself parametric, together with a residue. The residue corresponds to the difference between the excitation and the corresponding parametric model.

When speech signals are modified, the frequency, rhythm or timbre information is changed through the parameters of the model.

However, these modifications cause audible distortions, notably because of a lack of control over temporal coherence, in particular when the fundamental frequency or the timbre is modified.

For example, the document "Applying the Harmonic plus Noise Model in concatenative speech synthesis", IEEE Transactions on Speech and Audio Processing, vol. 9 (1), pp. 21-29, January 2001, by Y. Stylianou, proposes using a harmonic plus noise model, or HNM model, with a temporal modulation of the noise part so that it blends naturally with the deterministic part. However, this method does not preserve the temporal coherence of the deterministic part.

Another approach is to have a model of the glottal source compact enough that the shape of the glottal signal can be controlled when the signal is modified. Such an approach is described, for example, in the document "Toward a high-quality singing synthesizer with vocal texture control", Stanford University, 2002, by H. L. Lu. Nevertheless, such a model does not capture all the information in the glottal signal. Residual information must be kept, and modifying it raises the problem of lack of temporal coherence mentioned above.

The document "Time-scale modification of complex acoustic signals", ICASSP 1993, vol. 1, pp. 213-216, 1993, by T. F. Quatieri, R. B. Dunn and T. E. Hanna, proposes a method for modifying speech signals that aims to preserve both the spectral envelope and the temporal envelope. This method is applied only to the modification of the duration of acoustic signals and is not practical, insofar as it is theoretically not possible to guarantee the existence of a solution satisfying both properties simultaneously. In addition, there is no convergence result for the proposed algorithm, and this method therefore does not provide sufficient control over the characteristics of the resulting signal.

Thus, there is no technique for modifying speech signals while ensuring good temporal coherence.

One of the objectives of the present invention is to allow such a modification.

For this purpose, the present invention, as defined by claim 1, relates to a method for modifying the acoustic characteristics of a speech signal, characterized in that it comprises:
  • a decomposition of the signal into a parametric part and a non-parametric residue;
  • an estimate of the temporal envelope of the residue;
  • a modification of acoustic characteristics of the parametric part and of the residue according to modification instructions;
  • a determination of a new temporal envelope for the modified residue by applying said modification instructions to the estimated temporal envelope; and
  • a synthesis of a modified speech signal from the modified parametric part and from the residue as modified and with the new temporal envelope.

Thanks to the specific processing carried out on the temporal characteristics of the residue, the temporal coherence of the modified signal is improved.

In one embodiment of the invention, said signal decomposition is a decomposition according to an excitation-filter type model. Such a decomposition makes it possible to obtain a residue corresponding to a glottal excitation.

Advantageously, the estimation of the temporal envelope of the residue comprises the estimation of a first envelope, followed by a temporal smoothing of this first envelope. This embodiment makes it possible to obtain a better estimate of the temporal envelope.

In a particular embodiment, the method further comprises a temporal normalization of the residue based on the estimate of the temporal envelope. This makes it possible to obtain an expression of the residue that is substantially independent of its temporal characteristics.

In a particular embodiment, the temporal normalization of the residue comprises dividing the residue by the estimate of the temporal envelope.

In another embodiment, the determination of a new temporal envelope for the residue comprises a modification of parameters of the temporal envelope of the residue according to said modification instructions and an application of the modified temporal envelope to the normalized residue.
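
As a compact summary of the normalization and of the application of the modified envelope described in the embodiments above, the two operations can be written as follows. The notation is an assumption made for illustration: e(n) is the residue, d(n) its estimated temporal envelope, a tilde marks the normalized residue, and the subscript "modif" marks quantities after the modification instructions have been applied.

```latex
\tilde{e}(n) = \frac{e(n)}{d(n)},
\qquad
e_{\text{new}}(n) = d_{\text{modif}}(n)\,\tilde{e}_{\text{modif}}(n)
```

Here $\tilde{e}_{\text{modif}}(n)$ stands for the normalized residue after its fundamental frequency and/or duration have been modified, and $e_{\text{new}}(n)$ is a hypothetical name for the resulting residue.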

In one embodiment, the estimation of the temporal envelope and the determination of a new temporal envelope are combined.

Advantageously, the modification of acoustic characteristics comprises a modification of the fundamental frequency and duration information of the parametric part and of the residue.

In addition, the invention also relates to a program for implementing the method described above, as defined in claim 9, and to a corresponding device, as defined in claim 10.

The invention will be better understood in the light of the description given by way of example and with reference to the figures, in which:
  • Figure 1 is a general flowchart of the method of the invention; and
  • Figures 2A to 2D represent different stages of the processing of a speech signal.

The method shown with reference to Figure 1 starts with a step 10 of analysis of the speech signal, which comprises a decomposition 12 according to an excitation-filter model, that is to say a decomposition of the speech signal into a parametric part and a non-parametric part, called the residue, which corresponds to a part of the glottal excitation.

Une pratique courante pour la mise en oeuvre de l'étape 12 est l'utilisation de techniques de prédiction linéaire telles que celles décrites dans le document de J. Makhoul, "Linear Prediction: a tutorial review", Proceedings of the IEEE, vol. 63(4), pp. 561-580, April 1975 .A common practice for the implementation of step 12 is the use of linear prediction techniques such as those described in the J. Makhoul, "Linear Prediction: A Tutorial Review," Proceedings of the IEEE, vol. 63 (4), pp. 561-580, April 1975 .

In the embodiment described by way of example, the decomposition 12 of the speech signal s(n) is performed using an autoregressive (AR) model of the following form:

$$ s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + e(n). $$

In this equation, the terms $a_k$ denote the coefficients of an AR filter modeling the vocal tract, and the term e(n) is the residual signal relating to the excitation part, n being a signal frame index. Note that if the order p of the model is large enough, then e(n) is not correlated with s(n).

Formally, E[e(n)s(n-m)] = 0 for any integer m, where E[.] denotes the mathematical expectation.

In practice, typical orders of 10 and 16 are chosen for speech signals sampled at 8 kHz and 16 kHz respectively.

Multiplying both sides of the previous equation by s(n-m) and taking the mathematical expectation yields the Yule-Walker equations:

$$ r(m) = \sum_{k=1}^{p} a_k\, r(m-k), \qquad m = 1, \dots, p, $$

where r is the autocorrelation function defined by r(m) = E[s(n)s(n-m)].

An estimator of r(m) is given by:

$$ r(m) = \frac{1}{N-p} \sum_{n=1}^{N-p} s(n)\, s(n-m). $$

In practice, only the first p+1 values of the autocorrelation function are needed to estimate the filter coefficients $a_k$. Writing the Yule-Walker equations in matrix form leads to the following linear system:

$$ \begin{bmatrix} r(0) & r(1) & \cdots & r(p-1) \\ r(1) & r(0) & \cdots & r(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(p-1) & r(p-2) & \cdots & r(0) \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} r(1) \\ r(2) \\ \vdots \\ r(p) \end{bmatrix}. $$

The estimation of the coefficients thus amounts to inverting a Toeplitz matrix, which can be done using conventional procedures, in particular the algorithm described by J. Durbin, "The fitting of time-series models", Rev. Inst. Int. Statistics.
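By way of illustration only, the Toeplitz system can be solved with the Levinson-Durbin recursion. The following Python sketch assumes the sign convention of the AR model given above, s(n) = Σ a_k s(n-k) + e(n); the function name `levinson_durbin` and its return values are illustrative choices and are not taken from the description.

```python
def levinson_durbin(r, p):
    """Solve the Yule-Walker system for a_1..a_p from r(0)..r(p).

    Returns (a, err): a[k] holds a_(k+1) and err is the power of the
    prediction error at order p.
    """
    a = []            # coefficients of the current-order predictor
    err = r[0]        # prediction-error power at order 0
    for m in range(1, p + 1):
        # Reflection coefficient of order m
        acc = r[m] - sum(a[k] * r[m - 1 - k] for k in range(m - 1))
        k_m = acc / err
        # Update a_1..a_(m-1) and append a_m = k_m
        a = [a[k] - k_m * a[m - 2 - k] for k in range(m - 1)] + [k_m]
        err *= 1.0 - k_m * k_m
    return a, err


# An AR(1) process with a_1 = 0.5 has the exact autocorrelation r(m) = 0.5**m
coeffs, err = levinson_durbin([1.0, 0.5, 0.25, 0.125], 3)
```

The recursion exploits the Toeplitz structure and needs on the order of p² operations, instead of the p³ of a general linear solve.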

As a variant, the decomposition 12 yields, for the excitation, a parametric model in addition to the residual.

For example, the excitation-filter decomposition is carried out using a priori information on the excitation. The excitation can thus be modeled by integrating information related to the speech production process, in particular via a parametric model of the derivative of the glottal flow waveform (DODG), such as the LF model proposed by Liljencrants and Fant in "A four-parameter model of glottal flow", STL-QPSR, vol. 4, pp. 1-13, 1985. This model is entirely defined by the fundamental period $T_0$, three shape parameters (an open quotient, an asymmetry coefficient and a return phase coefficient), a position parameter corresponding to the glottal closure instant, and a term $b_0$ characterizing the amplitude of the DODG.

In this context, the speech signal can be represented by the following exogenous autoregressive ARX-LF model:

$$ s(n) = \sum_{k=1}^{p} a_k\, s(n-k) + b_0\, u(n) + e(n), $$

where u(n) denotes the signal corresponding to the LF model of the DODG.

The simultaneous estimation of the DODG parameters and of the filter parameters is delicate, in particular because the optimization with respect to the shape and position parameters is a non-linear problem. However, when $T_0$ and u are fixed, the optimization with respect to the parameters $a_k$ and $b_0$ is a classical linear problem, for which a least-squares estimator can be obtained analytically. On the basis of this observation, an efficient method was proposed by D. Vincent, O. Rosec and T. Chonavel in "Estimation of LF glottal source parameters based on ARX model", Interspeech'05, pp. 333-336, Lisbon, Portugal, 2005.
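The linear sub-problem can be sketched as follows, assuming that the source signal u(n) is already fixed. This is only a minimal illustration of a least-squares fit of $a_k$ and $b_0$ through the normal equations; it is not the method of the cited publication, and the names `fit_arx` and `solve_linear` are hypothetical.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_arx(s, u, p):
    """Least-squares estimate of a_1..a_p and b_0 for a fixed source u."""
    # One regression row per sample n >= p: [s(n-1), ..., s(n-p), u(n)]
    rows = [[s[n - k] for k in range(1, p + 1)] + [u[n]]
            for n in range(p, len(s))]
    targets = s[p:]
    q = p + 1
    # Normal equations (X^T X) theta = X^T y
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(q)] for i in range(q)]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(q)]
    theta = solve_linear(gram, rhs)
    # Residual e(n) = s(n) - sum_k a_k s(n-k) - b_0 u(n)
    e = [y - sum(t * x for t, x in zip(theta, r)) for r, y in zip(rows, targets)]
    return theta[:p], theta[p], e


# Synthetic check: build s from known a_k and b_0 with e(n) = 0
u = [math.sin(0.3 * n) + (1.0 if n % 7 == 0 else 0.0) for n in range(60)]
s = [0.0, 0.0]
for n in range(2, 60):
    s.append(0.5 * s[n - 1] - 0.3 * s[n - 2] + 2.0 * u[n])
a_hat, b0_hat, e_hat = fit_arx(s, u, 2)
```

In a complete estimator, this linear fit would be repeated for each candidate set of shape and position parameters, which are the non-linear part of the problem.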

In this embodiment, at the end of the estimation procedure, the method delivers:
  • parameters completely characterizing the DODG according to the LF model;
  • the filter parameters $a_k$;
  • the residual e(n) corresponding to the modeling error of the ARX-LF model.

In general, at the end of step 12, the method delivers a model of the speech signal s(n) in the form of a parametric part and a non-parametric residual.

The analysis step 10 then comprises an estimation 14 of the temporal envelope of the residual.

In the embodiment described, the temporal envelope is defined as the modulus of the analytic signal and is obtained by a Hilbert transform. The temporal envelope d(t) of the residual e(t) is thus written:

$$ d(t) = |x_e(t)| \quad \text{with} \quad x_e(t) = e(t) + i\,H(e(t)), $$

where H denotes the Hilbert transform.
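As a purely illustrative sketch, the analytic signal of a short frame can be computed in the frequency domain by cancelling the negative-frequency bins and doubling the positive ones. The code below uses a direct O(N²) discrete Fourier transform to stay dependency-free; the function names are hypothetical, and a real implementation would more likely rely on an FFT library.

```python
import cmath
import math


def analytic_signal(e):
    """Return x_e(n) = e(n) + i H(e)(n) for a real frame e."""
    N = len(e)
    # Forward DFT (quadratic cost, acceptable for a short frame)
    spectrum = [sum(e[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
                for k in range(N)]
    for k in range(N):
        if k == 0 or (N % 2 == 0 and k == N // 2):
            gain = 1.0            # DC and Nyquist bins are kept as they are
        elif k < (N + 1) // 2:
            gain = 2.0            # positive frequencies are doubled
        else:
            gain = 0.0            # negative frequencies are removed
        spectrum[k] *= gain
    # Inverse DFT
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]


def temporal_envelope(e):
    """d(n) = |x_e(n)|, the modulus of the analytic signal."""
    return [abs(x) for x in analytic_signal(e)]


# Amplitude-modulated carrier whose envelope is known in closed form
N = 32
true_env = [1.0 + 0.5 * math.cos(2 * math.pi * n / N) for n in range(N)]
e = [true_env[n] * math.cos(2 * math.pi * 8 * n / N) for n in range(N)]
d = temporal_envelope(e)
```

On this synthetic frame the computed envelope follows the slow modulation and ignores the carrier, which is the behavior expected of d(t).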

Advantageously, the estimation 14 comprises a smoothing of the temporal envelope of the residual. This provides a better estimate, in particular for voiced sounds, for which the envelope is periodic with period $T_0$, where $T_0$ denotes the inverse of the fundamental frequency $f_0$. For example, a cepstral model of order K of the envelope can be used. It is written in the form:

$$ \ln d(n) = \frac{1}{2} \sum_{k=-K}^{K} c_k \exp\!\left(2 i \pi k n f_0 / f_s\right) + \varepsilon(n). $$

The cepstral coefficients $c_k$ are then estimated by minimizing ε(n) in the least-squares sense. More precisely, the preceding equation is written in the following matrix form:

$$ \mathbf{d} = \mathbf{M}\mathbf{c} + \boldsymbol{\varepsilon}, $$

with

$$ \mathbf{d} = 2\,[\ln d(-N), \dots, \ln d(N)]^T, $$

$$ M_{n+(N+1),\,k+(K+1)} = \exp\!\left(2 i \pi k n f_0 / f_s\right), \quad n \in \{-N, \dots, N\},\ k \in \{-K, \dots, K\}, $$

and

$$ \mathbf{c} = [c_{-K}, \dots, c_K]^T. $$

In these equations, the superscript T denotes the transposition operator.

The optimal solution in the least-squares sense is then

$$ \hat{\mathbf{c}} = (\mathbf{M}^H \mathbf{M})^{-1} \mathbf{M}^H \mathbf{d}, $$

where H denotes the Hermitian transposition operator. The corresponding envelope is written as follows:

$$ \hat{d}(n) = \exp\!\left( \frac{1}{2} \sum_{k=-K}^{K} \hat{c}_k \exp\!\left(2 i \pi k n f_0 / f_s\right) \right). $$
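The least-squares fit of the cepstral coefficients can be sketched as below. The sketch takes the model ln d(n) = ½ Σ c_k exp(2iπknf0/fs) at face value, so the regression target is 2 ln d(n); it assumes that fs is the sampling frequency and that the envelope samples are strictly positive. The function names are hypothetical, and the normal equations are solved with a plain Gaussian elimination rather than a library call.

```python
import cmath
import math


def solve_linear(A, b):
    """Gaussian elimination with partial pivoting (works on complex entries)."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_envelope(d, f0, fs, K):
    """d holds d(-N), ..., d(N); returns the estimates of c_{-K}, ..., c_K."""
    N = (len(d) - 1) // 2
    ks = range(-K, K + 1)
    M = [[cmath.exp(2j * math.pi * k * n * f0 / fs) for k in ks]
         for n in range(-N, N + 1)]
    y = [2.0 * math.log(v) for v in d]      # target vector 2 ln d(n)
    q = 2 * K + 1
    # Normal equations (M^H M) c = M^H y
    gram = [[sum(row[i].conjugate() * row[j] for row in M) for j in range(q)]
            for i in range(q)]
    rhs = [sum(row[i].conjugate() * v for row, v in zip(M, y)) for i in range(q)]
    return solve_linear(gram, rhs)


def smoothed_envelope(c, f0, fs, n):
    """exp(1/2 sum_k c_k exp(2 i pi k n f0 / fs)); the sum is real when c_{-k} = conj(c_k)."""
    K = (len(c) - 1) // 2
    acc = sum(c[k + K] * cmath.exp(2j * math.pi * k * n * f0 / fs)
              for k in range(-K, K + 1))
    return math.exp(0.5 * acc.real)


# Synthetic envelope generated from c_0 = 0.4 and c_1 = c_{-1} = 0.3
fs, f0, N = 8000.0, 100.0, 80
d = [math.exp(0.5 * (0.4 + 0.6 * math.cos(2 * math.pi * n * f0 / fs)))
     for n in range(-N, N + 1)]
c_hat = fit_envelope(d, f0, fs, K=2)
```

Keeping K small relative to the number of samples per period is what produces the smoothing: only the first K harmonics of the envelope are retained.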

Once the temporal envelope of the residual has been estimated, the method comprises a step 16 of temporal normalization of the residual. In this document, temporal normalization means obtaining a residual that is substantially invariant over time or, more precisely, a residual whose temporal envelope is constant.

In the embodiment described, step 16 is implemented by dividing the residual by the temporal envelope according to the following equation:

$$ \tilde{e}(n) = \frac{e(n)}{\hat{d}(n)}. $$

In parallel with the analysis 10, the method comprises a step 18 of determining instructions for modifying the speech signal. These instructions can be of two types.

In a first case, a target has been defined for each of the parameters to be modified. This is notably the case in speech synthesis, where many algorithms exist for predicting duration, fundamental frequency or signal energy. For example, fundamental frequency and energy values can be estimated for the beginning and the end of each syllable or of each phoneme of the utterance. Similarly, the duration of each syllable or phoneme can be predicted. Given these numerical targets and the speech signal, modification coefficients can be obtained by taking the ratio between the measurement made on the signal and the value of the corresponding predicted target.

In a second case, such targets are not available, but it is possible to define a set of modification coefficients for the parameters to be modified. For example, a fundamental frequency modification coefficient of 0.5 divides the perceived pitch of the voice by 2. Note that these modification coefficients can be defined globally over the whole utterance or more locally, for example at the scale of a syllable or a word.
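A minimal sketch of how such coefficients might be derived from targets is given below. The description does not fix the direction of the ratio; the sketch assumes target divided by measurement, which is the convention under which a coefficient of 0.5 halves the pitch. The function name and the numerical values are hypothetical.

```python
def modification_coefficient(measured, target):
    """Ratio target / measured, so that a value of 0.5 halves the parameter."""
    if measured <= 0:
        raise ValueError("the measured value must be strictly positive")
    return target / measured


# A phoneme measured at 200 Hz with a predicted target of 100 Hz
beta = modification_coefficient(200.0, 100.0)

# Local coefficients, for example one per syllable (hypothetical values in Hz)
measured_f0 = [210.0, 190.0, 180.0]
target_f0 = [105.0, 190.0, 270.0]
betas = [modification_coefficient(m, t) for m, t in zip(measured_f0, target_f0)]
```

The same helper applies unchanged to durations or energies, since each coefficient is simply a dimensionless ratio.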

The method then comprises a step 20 of modifying the speech signal s(n) according to the instructions determined previously.

The modifications concern the fundamental frequency, the duration and the energy of the speech signals. In addition, when an analysis using a DODG is implemented, since a source-filter decomposition is available, voice quality parameters can be modified by altering the open quotient, the asymmetry coefficient or the return phase coefficient.

The modification step 20 first comprises a modification 22 of the parametric part of the model corresponding to the speech signal and of the normalized residual.

In the embodiment described, this modification concerns the fundamental frequency and the duration, and is implemented in a conventional manner with the technique known as TD-PSOLA (Time Domain Pitch Synchronous Overlap and Add), as described by E. Moulines and J. Laroche in "Non-parametric techniques for pitch-scale and time-scale modification of speech", Speech Communication, vol. 16, pp. 175-205, 1995.

This technique makes it possible to modify the duration and the fundamental frequency jointly, with the respective coefficients α(t) and β(t).

The main steps of the TD-PSOLA technique are illustrated with reference to Figures 2A to 2D.

Figure 2A shows the speech signal s(n) to be modified. During a step 24, this signal is segmented into frames in a so-called pitch-synchronous manner, that is to say that each segment has a duration corresponding to the inverse of the fundamental frequency of the signal.

Indeed, the glottal closure instants, also called analysis instants, are located in the vicinity of the energy maxima of the speech signal, and TD-PSOLA processing preserves the characteristics of the speech signal well in the vicinity of the ends of the segments obtained by pitch-synchronous analysis. Thus, when these instants are located with satisfactory accuracy, the performance of TD-PSOLA is optimized. Such a pitch-synchronous segmentation is obtained, for example, by techniques based on group delay or by the method proposed by D. Vincent, O. Rosec and T. Chonavel in "Glottal closure instant estimation using an appropriateness measure of the source and continuity constraints", IEEE ICASSP'06, vol. 1, pp. 381-384, Toulouse, France, May 2006.

Advantageously, this pitch-synchronous marking step is performed offline, that is to say not in real time, which reduces the computational load for a real-time implementation.

Depending on the modification factors desired for the fundamental frequency and the duration, the instants separating the segments are modified according to the following rules:
  • to lengthen the duration, certain segments are duplicated in order to artificially increase the number of glottal pulses;
  • to shorten the duration, certain segments are removed;
  • to raise the fundamental frequency, that is to say to obtain a higher-pitched rendering, the analysis instants are moved closer together, which may require segments to be duplicated in order to preserve the total duration; and
  • to lower the fundamental frequency, that is to say to obtain a lower-pitched rendering, the analysis instants are moved further apart, which may require segments to be removed in order to preserve the total duration.

A detailed description of these rules can be found in E. Moulines and J. Laroche, "Non-parametric techniques for pitch-scale and time-scale modification of speech", Speech Communication, vol. 16, pp. 175-205, 1995.

At the end of this step, the signal comprises an integer number of segments or frames, each of a duration corresponding to a period that is the inverse of the modified fundamental frequency, as shown in Figure 2B.

The modification processing then comprises a windowing 26 of the signal around the analysis instants, that is to say the instants separating the segments. During this windowing, for each analysis instant, a portion of the signal windowed around that instant is selected. This portion of signal is called the "short-term signal" and extends, in the example, over a duration corresponding to twice the modified fundamental period, as shown in Figure 2C.

Finally, the modification processing comprises a summation 28 of the short-term signals, which are re-centered on the synthesis instants and added together, as shown in Figure 2D.
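The segmentation, windowing and summation steps can be sketched as follows. This is a deliberately simplified illustration and not the complete TD-PSOLA algorithm of the cited publication: it assumes constant factors alpha (duration) and beta (fundamental frequency) instead of time-varying α(t) and β(t), a Hann window, and a nearest-instant rule for choosing which segment to reuse. The function name `td_psola` is hypothetical.

```python
import math


def td_psola(x, marks, alpha=1.0, beta=1.0):
    """Simplified TD-PSOLA with constant duration and pitch factors.

    x     : list of samples
    marks : increasing analysis instants (sample indices), about one per period
    Returns (y, synthesis_marks).
    """
    # Local fundamental period attached to each analysis instant
    periods = [marks[i + 1] - marks[i] for i in range(len(marks) - 1)]
    periods.append(periods[-1])
    y = [0.0] * int(round(alpha * len(x)))
    synthesis_marks = []
    t_s = alpha * marks[0]
    while t_s <= alpha * marks[-1]:
        # Analysis instant closest to t_s mapped back to the original time axis;
        # reusing or skipping instants is what duplicates or removes segments
        i = min(range(len(marks)), key=lambda j: abs(marks[j] - t_s / alpha))
        new_period = periods[i] / beta
        half = max(1, int(round(new_period)))
        centre = int(round(t_s))
        # Hann-windowed short-term signal, two modified periods long,
        # re-centred on the synthesis instant and added to the output
        for j in range(-half, half + 1):
            src, dst = marks[i] + j, centre + j
            if 0 <= src < len(x) and 0 <= dst < len(y):
                y[dst] += 0.5 * (1.0 + math.cos(math.pi * j / half)) * x[src]
        synthesis_marks.append(centre)
        t_s += new_period
    return y, synthesis_marks


period = 20
x = [math.sin(2 * math.pi * n / period) for n in range(400)]
marks = list(range(0, 400, period))                  # 0, 20, ..., 380
y_same, m_same = td_psola(x, marks)                  # no modification
y_long, m_long = td_psola(x, marks, alpha=2.0)       # longer, same period
y_high, m_high = td_psola(x, marks, beta=2.0)        # same length, shorter period
```

With alpha = beta = 1 the overlapping Hann windows sum to one between the first and last instants, so the signal is reproduced there unchanged; this is a convenient sanity check before applying actual modification factors.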

As a variant, step 22 is carried out with a technique of the HNM (Harmonic plus Noise Model) type or of the phase vocoder type. The fundamental frequency and duration modifications can also be carried out by different techniques.

In the following, the modified normalized residual, that is to say the normalized residual whose fundamental frequency and/or duration information has been modified, is denoted $\tilde{e}_{\mathrm{modif}}(n)$.

The method then comprises a step 30 of modifying the temporal envelope of the residual. More precisely, this step replaces the original temporal characteristics of the residual with temporal characteristics consistent with the desired modifications.

Step 30 begins with a determination 32 of new temporal characteristics of the residual. In the example, this is the modification of the temporal envelope of the residual as obtained at the end of step 14.

As indicated above, considering a pitch-synchronous frame of the signal, two types of modification can be carried out, jointly or separately:
  • a modification of the fundamental frequency; and
  • a modification of the parameters related to voice quality.

The modification of the fundamental frequency consists in modifying the temporal envelope to make it consistent with the normalized residual whose fundamental frequency has previously been modified.

One embodiment of such a modification consists of a dilation/contraction of the original temporal envelope $\hat{d}(n)$ in order to preserve its general shape.

Given the modified fundamental frequency value $f_0^{\mathrm{modif}}$, the modified temporal envelope $d_{\mathrm{modif}}$ is then written as follows:

$$ d_{\mathrm{modif}}(n) = \exp\!\left( \frac{1}{2} \sum_{k=-K}^{K} \hat{c}_k \exp\!\left(2 i \pi k n f_0^{\mathrm{modif}} / f_s\right) \right). $$
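The effect of this formula can be checked numerically with the short sketch below, which evaluates the envelope for the original and for the modified fundamental frequency using the same cepstral coefficients. The coefficient values, the sampling frequency and the function name are hypothetical and serve only the illustration.

```python
import cmath
import math


def envelope(c_hat, f0, fs, n):
    """exp(1/2 * sum_k c_k exp(2 i pi k n f0 / fs)) for k = -K..K."""
    K = (len(c_hat) - 1) // 2
    acc = sum(c_hat[k + K] * cmath.exp(2j * math.pi * k * n * f0 / fs)
              for k in range(-K, K + 1))
    return math.exp(0.5 * acc.real)   # the sum is real when c_{-k} = conj(c_k)


fs = 8000.0
c_hat = [0.3, 0.4, 0.3]               # hypothetical c_{-1}, c_0, c_1
original = [envelope(c_hat, 100.0, fs, n) for n in range(160)]   # period of 80 samples
modified = [envelope(c_hat, 200.0, fs, n) for n in range(160)]   # period of 40 samples
```

Doubling the fundamental frequency compresses the envelope by a factor of two along the time axis while leaving its shape over one period unchanged, so that modified[n] coincides with original[2n].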

When the parameters related to voice quality are modified, the shape of the temporal envelope must be modified. For example, when the open quotient is modified, different dilation/contraction factors should be applied to the open and closed parts of the glottal cycle respectively.

For example, the open quotient is modified so that the duration of the open phase becomes $T_e^{\mathrm{modif}}$, with $T_e^{\mathrm{modif}} < T_0$, where $T_0$ is the length of a glottal cycle whose closure instant coincides with the time origin and whose original open phase has duration $T_e$. In this case, to keep the same fundamental period, the signal should be dilated according to the following coefficients:

$$ \alpha_1 = \frac{T_0 - T_e^{\mathrm{modif}}}{T_0 - T_e} \quad \text{for the closed phase;} $$

$$ \alpha_2 = \frac{T_e^{\mathrm{modif}}}{T_e} \quad \text{for the open phase.} $$

Mathematically, this amounts to determining a temporal envelope of the following form:

$$ d_{\mathrm{modif}}(t) = \exp\!\left( \frac{1}{2} \sum_{k=-K}^{K} \hat{c}_k \exp\!\left(2 i \pi k\, g(t) / T_0^{\mathrm{modif}}\right) \right), $$

where the function g is defined by:

$$ g(t) = \begin{cases} \dfrac{T_0 - T_e^{\mathrm{modif}}}{T_0 - T_e}\, t & \text{for } t \in [0,\ T_0 - T_e], \\[2ex] \left(T_0 - T_e^{\mathrm{modif}}\right) + \dfrac{T_e^{\mathrm{modif}}}{T_e}\,\bigl(t - (T_0 - T_e)\bigr) & \text{for } t \in [T_0 - T_e,\ T_0]. \end{cases} $$
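The piecewise-linear function g can be illustrated with the short sketch below, which applies the slope α1 over the closed phase and the slope α2 over the open phase. The function name `warp` and the durations used in the example are hypothetical.

```python
def warp(t, T0, Te, Te_modif):
    """Piecewise-linear map g(t) over one glottal cycle [0, T0]."""
    if not 0.0 <= t <= T0:
        raise ValueError("t must lie within one glottal cycle")
    closed = T0 - Te                      # duration of the original closed phase
    if t <= closed:
        # Closed phase, slope alpha_1 = (T0 - Te_modif) / (T0 - Te)
        return (T0 - Te_modif) / closed * t
    # Open phase, slope alpha_2 = Te_modif / Te
    return (T0 - Te_modif) + (Te_modif / Te) * (t - closed)


# Hypothetical durations in milliseconds: the open phase shrinks from 6 to 4
T0, Te, Te_modif = 10.0, 6.0, 4.0
g_start = warp(0.0, T0, Te, Te_modif)
g_boundary = warp(T0 - Te, T0, Te, Te_modif)
g_end = warp(T0, T0, Te, Te_modif)
```

The two branches meet at t = T0 - Te and the cycle end is mapped onto itself, which is what keeps the fundamental period unchanged while the open quotient varies.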

Of course, other types of modification of voice quality parameters are possible according to similar principles.

Step 30 then comprises a determination 34 of the new residual. In the example, this new residual is obtained by multiplying the residual $\tilde{e}_{\mathrm{modif}}(n)$ by the modified envelope $d_{\mathrm{modif}}$.

The original residual has thus been normalized, modified, and then combined with the new temporal envelope. This ensures that its temporal envelope is consistent with the modifications of fundamental frequency and/or voice quality.
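The normalization of step 16 and the recombination of determination 34 are simple sample-wise operations, sketched below under the assumption that the envelope is strictly positive. The function names and the sample values are hypothetical, and the prosodic modification that normally takes place between the two operations is omitted.

```python
def normalize(e, d_hat):
    """Step 16: divide the residual by its estimated temporal envelope."""
    return [x / d for x, d in zip(e, d_hat)]


def apply_envelope(e_modif, d_modif):
    """Determination 34: multiply the modified normalized residual by the new envelope."""
    return [x * d for x, d in zip(e_modif, d_modif)]


e = [0.2, -1.5, 0.9, -0.3]        # hypothetical residual samples
d_hat = [0.5, 2.0, 1.0, 0.5]      # hypothetical, strictly positive envelope
e_norm = normalize(e, d_hat)
# With no modification and an unchanged envelope, the residual is recovered
e_back = apply_envelope(e_norm, d_hat)
```

When a modification is applied, only the second call changes: it receives the modified normalized residual and the modified envelope instead of the original ones.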

Dans le mode de réalisation décrit, l'excitation est confondue avec le résidu, ce qui correspond au cas où le résidu est obtenu par simple filtrage linéaire inverse et où l'excitation ne comporte par de partie paramétrique.In the embodiment described, the excitation is merged with the residue, which corresponds to the case where the residue is obtained by simple inverse linear filtering and where the excitation comprises no parametric part.

Where the excitation consists of a glottal source that can be represented by a parametric model, plus a residual, the same type of modification should be applied to the parameterized glottal source by adjusting its fundamental frequency and voice quality parameters.

Finally, the method comprises a step 40 of synthesizing the modified signal. This synthesis consists of filtering the signal obtained at the end of step 20 with the vocal tract filter as defined in step 12. Step 40 also comprises an overlap-add of the frames thus filtered. This synthesis step is conventional and will not be described in further detail here.
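Since the text leaves the synthesis step undescribed, the following is only a sketch of one conventional way to do it. It assumes that the vocal tract filter is an all-pole filter 1/A(z) given per frame by coefficients `a` with `a[0] == 1`, and that the frames are already windowed and of equal length. The names `all_pole_filter`, `overlap_add` and `synthesize` are hypothetical.

```python
import numpy as np

def all_pole_filter(x, a):
    """Filter x through 1/A(z): y[n] = x[n] - sum_{m>=1} a[m] * y[n-m], with a[0] == 1."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        acc = x[n]
        for m in range(1, len(a)):
            if n - m >= 0:
                acc -= a[m] * y[n - m]
        y[n] = acc
    return y

def overlap_add(frames, hop):
    """Sum equal-length frames placed every `hop` samples."""
    frame_len = len(frames[0])
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += frame
    return out

def synthesize(residual_frames, filters, hop):
    """Filter each modified residual frame by its vocal tract filter, then overlap-add."""
    filtered = [all_pole_filter(e, a) for e, a in zip(residual_frames, filters)]
    return overlap_add(filtered, hop)
```

Filter memory is reset at each frame here for simplicity; a production synthesizer would typically handle the filter state and the window normalization more carefully.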

Thus, the specific processing of the temporal envelope of the residual yields a modification with good temporal coherence.

Of course, other embodiments can be envisaged.

First, the residual can be decomposed into sub-bands. In this case, steps 14, 16 and 20 are performed on all or some of the sub-bands, each considered separately. The final residual is then the sum of the modified residuals from the different sub-bands.
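The sub-band variant can be illustrated with the sketch below. The patent does not specify the filter bank; this example assumes a simple split by FFT bin masking, chosen only because the bands then sum back exactly to the input. The names `split_subbands`, `process_subbands`, `edges` and `process_band` are hypothetical, and `process_band` stands for the per-band envelope estimation, normalization and modification.

```python
import numpy as np

def split_subbands(x, edges):
    """Split x into sub-bands by masking rfft bins.

    edges: increasing bin indices marking the boundaries between bands.
    The returned bands sum to x because the masks partition the spectrum.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    bounds = [0] + list(edges) + [len(spectrum)]
    bands = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        masked = np.zeros_like(spectrum)
        masked[lo:hi] = spectrum[lo:hi]
        bands.append(np.fft.irfft(masked, n=len(x)))
    return bands

def process_subbands(x, edges, process_band):
    """Process each sub-band separately and sum the modified sub-band residuals."""
    return sum(process_band(band) for band in split_subbands(x, edges))
```

To process only some of the sub-bands, `process_band` can return the untouched band for the others.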

In addition, the residual can be decomposed into a deterministic part and a stochastic part. In this case, steps 14, 16 and 20 are performed for each of these parts. Here again, the final residual is the sum of the modified deterministic and stochastic components.

Furthermore, these two variants can be combined, so that each sub-band and each of the deterministic and stochastic components can be processed separately.

In another embodiment, the steps of the invention can be performed in a different order. For example, the temporal envelope is modified before the modifications are applied to the signal. The modifications are then made to the residual carrying its new temporal envelope, rather than to the normalized residual as in the example described above.

According to another embodiment, the steps of normalizing the residual and determining new temporal characteristics are combined. In such an embodiment, the residual is modified directly by a temporal factor determined from its temporal envelope and from the modification guidelines. This temporal factor both removes the residual's dependence on its original temporal characteristics and applies the new temporal characteristics.
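One possible reading of this combined variant is a single multiplicative factor equal to the ratio of the new envelope to the original one. The sketch below assumes exactly that; the text does not give the formula, and the names `temporal_factor` and `apply_temporal_factor` are hypothetical. Both envelopes are assumed to be positive and sampled on the same grid as the residual.

```python
import numpy as np

def temporal_factor(envelope, new_envelope, eps=1e-12):
    """Single factor that cancels the original envelope and applies the new one."""
    envelope = np.asarray(envelope, dtype=float)
    new_envelope = np.asarray(new_envelope, dtype=float)
    return new_envelope / np.maximum(envelope, eps)

def apply_temporal_factor(residual, envelope, new_envelope):
    """Modify the residual in one pass, without an explicit normalization step."""
    return np.asarray(residual, dtype=float) * temporal_factor(envelope, new_envelope)
```

Under this assumption the result matches dividing by the original envelope and then multiplying by the new one, provided no other modification is applied in between.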

Furthermore, the invention can be implemented by a program containing specific instructions which, when executed by a computer, cause the steps described above to be carried out.

The invention can also be implemented by a device comprising appropriate means, such as microprocessors, microcomputers and associated memories, or programmed electronic components.

Such a device can be adapted to implement any embodiment of the method described above.

Claims (11)

  1. Method of modifying the acoustic characteristics of a speech signal (s(n)) characterized in that it comprises:
    - a decomposition (12) of the signal into a parametric part and a non-parametric residual (e(n));
    - an estimation (14) of the temporal envelope of the residual;
    - a modification (22) of acoustic characteristics of the parametric part and of the residual according to modification guidelines;
    - a determination (30) of a new temporal envelope for the modified residual by applying the said modification guidelines to the estimated temporal envelope; and
    - a synthesis (40) of a speech signal modified on the basis of the modified parametric part and of the residual as modified and with the new temporal envelope.
  2. Method according to Claim 1, characterized in that the said decomposition of the signal is a decomposition according to a model of excitation-filter type.
  3. Method according to either of Claims 1 and 2, characterized in that the estimation of the temporal envelope of the residual comprises the estimation of a first envelope and then a temporal smoothing of this first envelope.
  4. Method according to any one of Claims 1 to 3, characterized in that it furthermore comprises a temporal normalization (16) of the residual as a function of the estimation of the temporal envelope.
  5. Method according to Claim 4, characterized in that the temporal normalization of the residual comprises the division of the residual by the estimation of the temporal envelope.
  6. Method according to Claim 4 or 5, characterized in that the determination of a new temporal envelope for the residual comprises a modification (32) of parameters of the temporal envelope of the residual according to the said modification guidelines and an application (34) of the modified temporal envelope to the normalized residual.
  7. Method according to either of Claims 1 and 5, characterized in that the estimation of the temporal envelope and the determination of a new temporal envelope are merged.
  8. Method according to any one of Claims 1 to 7, characterized in that the modification of acoustic characteristics comprises a modification of information regarding fundamental frequency and duration of the parametric part and of the residual.
  9. Program for a device for modifying a speech signal (s(n)), characterized in that it comprises instructions which, when they are executed on a computer of this device, bring about the implementation of a method according to any one of Claims 1 to 8.
  10. Device for modifying a speech signal, characterized in that it comprises:
    - means for decomposing the signal into a parametric part and a non-parametric residual (e(n));
    - means for estimating the temporal envelope of the residual;
    - means for modifying acoustic characteristics of the parametric part and of the residual according to modification guidelines;
    - means for determining a new temporal envelope for the modified residual by applying the said modification guidelines to the estimated temporal envelope; and
    - means for synthesizing a speech signal modified on the basis of the modified parametric part and of the residual as modified and with the new temporal envelope.
  11. Device according to Claim 10, characterized in that it comprises means suitable for the implementation of a method according to any one of Claims 2 to 8.
EP08100453A 2007-01-15 2008-01-14 Modification of a voice signal Not-in-force EP1944755B1 (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
FR0700257A FR2911426A1 (en) 2007-01-15 2007-01-15 MODIFICATION OF A SPEECH SIGNAL

Publications (2)

Publication Number Publication Date
EP1944755A1 (en) 2008-07-16
EP1944755B1 (en) 2010-03-17

Family

ID=38232910

Family Applications (1)

Application Number Title Priority Date Filing Date
EP08100453A Not-in-force EP1944755B1 (en) 2007-01-15 2008-01-14 Modification of a voice signal

Country Status (5)

Country Link
US (1) US20080208599A1 (en)
EP (1) EP1944755B1 (en)
AT (1) ATE461514T1 (en)
DE (1) DE602008000802D1 (en)
FR (1) FR2911426A1 (en)


Also Published As

Publication number Publication date
US20080208599A1 (en) 2008-08-28
EP1944755A1 (en) 2008-07-16
DE602008000802D1 (en) 2010-04-29
ATE461514T1 (en) 2010-04-15
FR2911426A1 (en) 2008-07-18

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

AX Request for extension of the european patent

Extension state: AL BA MK RS

17P Request for examination filed

Effective date: 20081223

17Q First examination report despatched

Effective date: 20090204

AKX Designation fees paid

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REF Corresponds to:

Ref document number: 602008000802

Country of ref document: DE

Date of ref document: 20100429

Kind code of ref document: P

REG Reference to a national code

Ref country code: NL

Ref legal event code: VDEP

Effective date: 20100317

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100617

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

LTIE Lt: invalidation of european patent or patent extension

Effective date: 20100317

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

REG Reference to a national code

Ref country code: IE

Ref legal event code: FD4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100618

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100628

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100717

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100617

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

26N No opposition filed

Effective date: 20101220

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

BERE Be: lapsed

Owner name: FRANCE TELECOM

Effective date: 20110131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120131

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20120131

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20110114

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20100317

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20100317

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 8

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 9

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20161228

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: FR

Payment date: 20161221

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: DE

Payment date: 20161219

Year of fee payment: 10

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602008000802

Country of ref document: DE

GBPC Gb: european patent ceased through non-payment of renewal fee

Effective date: 20180114

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180801

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180131

REG Reference to a national code

Ref country code: FR

Ref legal event code: ST

Effective date: 20180928

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20180114