EP0308817A2 - Method for converting channel vocoder parameters into LPC vocoder parameters - Google Patents

Info

Publication number
EP0308817A2
EP0308817A2 EP88115139A EP88115139A EP0308817A2 EP 0308817 A2 EP0308817 A2 EP 0308817A2 EP 88115139 A EP88115139 A EP 88115139A EP 88115139 A EP88115139 A EP 88115139A EP 0308817 A2 EP0308817 A2 EP 0308817A2
Authority
EP
European Patent Office
Prior art keywords
parameters
vocoder
channel
lpc
vocoder parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP88115139A
Other languages
German (de)
French (fr)
Other versions
EP0308817A3 (en)
Inventor
Hans Dipl.-Ing. Brandl
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Siemens AG
Siemens Corp
Original Assignee
Siemens AG
Siemens Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Siemens AG, Siemens Corp
Publication of EP0308817A2
Publication of EP0308817A3
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/06 Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Transmission Systems Not Characterized By The Medium Used For Transmission (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)
  • Carbon Steel Or Casting Steel Manufacturing (AREA)

Abstract

The invention relates to a method for converting channel vocoder parameters into LPC vocoder parameters, the LPC vocoder parameters being calculated from the smoothed power spectrum of the channel vocoder parameters by an inverse discrete Fourier transformation. According to the invention, for a prescribed channel number of the channel vocoder and a prescribed parameter number of the LPC vocoder, matrix elements are calculated from the variables, which are constant in this case, and the LPC vocoder parameters are computed from the channel vocoder parameters by matrix multiplications.

Description

The invention relates to a method according to the preamble of patent claim 1.

Digital narrowband communication networks with low data rates (1-2 kbit/s) are currently being planned. The coding methods used in them are based either on the channel vocoder principle or on linear prediction (LPC vocoder). Communication between the two kinds of vocoder is possible only if suitable data transcoding takes place at their interface.

The converter required for this should be as inexpensive as possible and should, as far as possible, not degrade the speech quality.

One way to build a converter is to transform the speech data back into the speech signal and then re-encode that signal.

This method is very costly, since two analysis units and two synthesis units are required. Analysis errors also degrade the speech quality. The degradation can be avoided by transcoding the data of the different vocoders directly. This is possible because the channel vocoder and the LPC vocoder use very similar synthesis principles.

In both cases the speech signal is generated by an excitation signal that is passed through a variable filter. The excitation signal consists of a pulse train for voiced sounds and of white noise for unvoiced sounds. The excitation parameters determine the pulse frequency and the excitation mode (voiced or unvoiced). The variable transfer behavior of the filter corresponds to the variable resonance behavior of the human vocal tract. This behavior changes slowly and is reset by filter parameters every 10 to 20 ms. The task of the speech signal analysis in a vocoder is to obtain the excitation parameters and the filter parameters from a speech signal. The LPC vocoder and the channel vocoder differ essentially in the structure of the filter: LPC assumes an all-pole filter, whereas the channel vocoder assumes a filter bank. The analysis methods for determining the corresponding filter parameters therefore differ, and different filter parameters are transmitted in the different networks. The excitation parameters, in contrast, are in principle the same.

What is sought, therefore, is a transcoding method that converts the filter parameters of the filter bank of a channel vocoder into the filter parameters of the all-pole filter of an LPC vocoder.

In communication-theory terms, the channel vocoder parameters (or coefficients) usually represent a non-equidistantly sampled spectrum. The power spectrum is calculated from the amplitude spectrum and transformed into the autocorrelation function (ACF; AKF in the German original) by means of the Fourier transform. The corresponding LPC vocoder parameter set can then be calculated from the ACF in a known manner using the usual methods, e.g. the Levinson recursion (see H. Hermansky, B. Hanson, H. Wakita, "Perceptually based Predictive Analysis of Speech", ICASSP 85, p. 13.10, conference proceedings).
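The last step of this known chain, going from autocorrelation values to predictor coefficients, can be illustrated with the textbook Levinson-Durbin recursion. The following is a minimal sketch, not the patent's implementation; the function name `levinson_durbin` and the sign convention A(z) = 1 + a[1] z^-1 + ... are assumptions made for the example.

```python
def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations for an all-pole model.

    r     -- autocorrelation values r[0..order]
    order -- predictor order (10 for an LPC-10 style model)
    Returns (a, err): a[0] == 1.0 and a[1..order] are the coefficients of
    the error filter A(z) = 1 + a[1] z^-1 + ...; err is the residual energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for stage m.
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        # Update a[1..m-1] from the previous stage, then append a[m] = k.
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        # Residual energy shrinks by (1 - k^2) at every stage.
        err *= 1.0 - k * k
    return a, err
```

For an autocorrelation sequence of the form r[i] = 0.5^i, the recursion returns a first-order model: a[1] is -0.5 and all higher coefficients vanish.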

The direct transformation involves considerable technical effort. Powerful real-time processors are needed to compute the spectra and correlation functions.

The object of the invention is to specify a method for transcoding channel vocoder parameters into LPC vocoder parameters that achieves high accuracy with relatively few arithmetic operations.

This object is achieved according to the invention by the features specified in patent claim 1.

A known transcoding method is explained below in terms of its mathematics.

The starting point is the channel vocoder parameters, which are available, for example, as a power spectrum (see FIG. 1). In a channel vocoder this power spectrum exists only in piecewise constant form b_k, with jumps at the transitions from b_k to b_{k+1}. FIG. 1 shows energy values e_j as these parameters b_k, where e_j is the energy in channel number j. As is generally known, the channel energy corresponds to the power in a 20 ms interval (the interval after which new filter parameters are set). This interval is also the transformation interval.

From this "raw" spectrum b_k, a smoothed spectrum a_k (see FIG. 2) is formed by convolution with a smoothing function g(i, s). The smoothing function g is an even function, g(i, s) = g(-i, s), with i as its argument and s as the spread, which determines the width of the smoothing function g.

Gaussian bell curves or similar functions, for example, are suitable as the smoothing function g. The following function is given as an example of a Gaussian bell curve:

Other possible smoothing functions g are the low-pass functions known from filter theory and digital signal processing. In these cases the spread s defines the corner frequency of the respective low-pass filter.

For the special case of a Dirac pulse

Figure imgb0001

b_k would be mapped unchanged onto the smoothed spectrum a_k.

When a real speech spectrum (b_k) is smoothed, the spread s can be a function of the current spectral line. In that case a larger spread s is chosen for the smoothing function g(i, s) at higher frequencies, where the channels in b_k are wider, than at lower frequencies. This allows the smoothing to be matched to the pitch perception of the human ear (Bark scale). The "euphony" of the synthesized speech can be tuned empirically through the choice of the spread or spreads s.
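A minimal sketch of such frequency-dependent smoothing is given here, assuming a Gaussian kernel whose spread grows linearly with the line index. The names `gaussian` and `smooth`, the 0-based indexing, the spread u * (k + 1) and the per-line normalization are illustrative assumptions, not details taken from the patent.

```python
import math

def gaussian(i, s):
    """Even smoothing kernel: g(i, s) == g(-i, s); s is the spread."""
    return math.exp(-0.5 * (i / s) ** 2)

def smooth(b, u):
    """Smooth a raw spectrum b[0..N-1] with a spread that grows with frequency.

    The spread for line k is u * (k + 1) (assumed). Each output line is
    divided by the sum of its kernel weights so that a flat spectrum
    stays flat; that normalization is also an assumption.
    """
    n = len(b)
    a = []
    for k in range(n):
        s = u * (k + 1)
        w = [gaussian(k - l, s) for l in range(n)]
        a.append(sum(wl * bl for wl, bl in zip(w, b)) / sum(w))
    return a
```

Applied to a step such as `[1.0] * 4 + [4.0] * 4`, the jump between the two channels is turned into a gradual transition, and the transition is wider at the high-frequency end because the spread is larger there.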

The following formula thus results for calculating the smoothed spectrum a_k from the "raw" spectrum b_k:

Figure imgb0002 [formula (1)]
where
g: smoothing function
u: smoothing width (normalization)
u·k: spread
a_k: k-th coefficient of the smoothed power spectrum
N: number of spectral coefficients
b_l: l-th coefficient of the raw spectrum.

The LPC coefficients are generally calculated from the short-term autocorrelation function (approx. 20 ms) of the speech signal, abbreviated ACF (AKF in the German original). The ACF, i.e. its correlation coefficients r_i, can also be determined from the power spectrum of the speech signal by the inverse discrete Fourier transform.

The following equations then result for the M correlation coefficients r_i:

Figure imgb0003 [formula (2)]
with i = 0, 1, ..., M (number of correlation coefficients; other symbols as in formula (1)).

Substituting formula (1) into formula (2) and applying the commutative law gives:

Figure imgb0004 [formula (3)]
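The formula images are not reproduced in this text. One form that is consistent with the symbol definitions given for formulas (1) and (2) is sketched here. It is an assumed reconstruction, not the patent's wording: the summation limits, the 1/N factor and the frequency grid ω_k are assumptions, and the primed numbers mark the lines as illustrative.

```latex
% Assumed reconstruction; \omega_k, the limits and the 1/N factor are not from the source.
\begin{align}
a_k &= \sum_{l=1}^{N} b_l \, g(k-l,\; u k) \tag{1'} \\
r_i &= \frac{1}{N} \sum_{k=1}^{N} a_k \cos(i\,\omega_k), \qquad i = 0, 1, \dots, M \tag{2'} \\
r_i &= \frac{1}{N} \sum_{l=1}^{N} b_l \sum_{k=1}^{N} g(k-l,\; u k)\,\cos(i\,\omega_k) \tag{3'}
\end{align}
```

Line (3') follows from inserting (1') into (2') and swapping the order of the two finite sums, which is the step the text attributes to the commutative law. The inner sum over k no longer depends on the speech data.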

The N spectral lines b_l of the raw spectrum can be derived from the channel energy values e_j (see FIG. 1).

In real vocoders the number of channels, and hence the number of channel energy values e_j, is about 16 to 18. With the number of spectral coefficients N in the region of 256, the coefficients b_l of the "raw" power spectrum can be written as follows:

(4) b_l = e_j for l = m_j, ..., (m_{j+1} - 1)
m_j: index of the first spectral line of channel j
m_{j+1} - 1: index of the last spectral line of channel j
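Formula (4) simply repeats each channel energy across the spectral lines that belong to its channel. A small sketch follows, assuming 0-based indices and a hypothetical band-edge list `m` with one more entry than there are channels; the function name `raw_spectrum` is likewise illustrative.

```python
def raw_spectrum(e, m):
    """Expand P channel energies into N piecewise constant spectral lines.

    e -- channel energies e[0..P-1]
    m -- band edges (0-based, hypothetical layout): m[j] is the index of the
         first line of channel j, m[j + 1] - 1 the index of its last line,
         so m[P] equals N.
    """
    if len(m) != len(e) + 1:
        raise ValueError("need one more band edge than channels")
    b = []
    for j, energy in enumerate(e):
        # Channel j contributes m[j + 1] - m[j] identical lines.
        b.extend([energy] * (m[j + 1] - m[j]))
    return b
```

With three channels of widths 2, 1 and 3, `raw_spectrum([1.0, 2.0, 3.0], [0, 2, 3, 6])` yields six lines; the wider channels at the upper end mirror the non-equidistant sampling mentioned earlier.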

Substituting formula (4) into formula (3) gives the following general equation for calculating the autocorrelation function from the vocoder channel energy values e_j, with j = 1 ... P:

Figure imgb0005 [formula (5)]
m_1 = 1: first spectral line of the first channel
m_P = N: last spectral line of the last channel

The transcoding method according to the invention is explained below.

In formula (5), all terms that follow the vocoder channel energy values e_j are constants.

For a given frequency and time grid of the channel vocoder and LPC vocoder parameters, formula (5) can be rewritten as a matrix multiplication:

Figure imgb0006 [formula (6)]
i = 0 ... M: coefficients of the autocorrelation function
P: number of channels
with:
Figure imgb0007 [formula (7)]
or in matrix notation
Figure imgb0008 = C × Figure imgb0009

where
Figure imgb0010 = autocorrelation vector
C: matrix with the elements from formula (7)
Figure imgb0011: channel vocoder energy vector
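The idea of folding all constant terms into a matrix can be checked numerically. The sketch below precomputes a matrix from an assumed Gaussian smoothing kernel and an assumed cosine transform, then compares the matrix product with the step-by-step path (expand, smooth, transform). Every name, the 0-based band edges `m`, the spread u * (k + 1), the normalization and the frequency grid are assumptions chosen for illustration; the patent's own formula (7) is not reproduced in this text.

```python
import math

def weights(k, n_lines, u):
    """Normalized Gaussian weights of raw lines l = 0..N-1 for smoothed line k.

    The spread u * (k + 1) and the normalization are illustrative assumptions.
    """
    s = u * (k + 1)
    w = [math.exp(-0.5 * ((k - l) / s) ** 2) for l in range(n_lines)]
    total = sum(w)
    return [x / total for x in w]

def build_matrix(m, n_lines, n_lags, u):
    """Precompute c[i][j] once, so that r[i] = sum_j c[i][j] * e[j].

    m holds 0-based band edges: channel j covers lines m[j] .. m[j + 1] - 1.
    """
    w = [weights(k, n_lines, u) for k in range(n_lines)]
    c = []
    for i in range(n_lags):
        # Assumed frequency grid: N lines spread over half the sampling rate.
        cos_ik = [math.cos(math.pi * i * k / n_lines) for k in range(n_lines)]
        row = []
        for j in range(len(m) - 1):
            acc = sum(w[k][l] * cos_ik[k]
                      for l in range(m[j], m[j + 1])
                      for k in range(n_lines))
            row.append(acc / n_lines)
        c.append(row)
    return c

def transcode(c, e):
    """Per-frame work: a single matrix-vector product."""
    return [sum(cij * ej for cij, ej in zip(row, e)) for row in c]

def direct(e, m, n_lines, n_lags, u):
    """Reference path per frame: expand, smooth, then cosine transform."""
    b = [e[j] for j in range(len(e)) for _ in range(m[j], m[j + 1])]
    a = [sum(wl * bl for wl, bl in zip(weights(k, n_lines, u), b))
         for k in range(n_lines)]
    return [sum(a[k] * math.cos(math.pi * i * k / n_lines)
                for k in range(n_lines)) / n_lines
            for i in range(n_lags)]
```

Because every step of the direct path is linear in the channel energies, `transcode(build_matrix(...), e)` and `direct(e, ...)` agree up to rounding for any energy vector, while only the former avoids recomputing the kernel and cosine terms for each frame.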

In the method according to the invention, the elements of the matrix C are calculated only once for a particular vocoder combination. After that, transcoding each set of speech parameters requires only a matrix multiplication between the energy vector E (which contains the parameters) and the matrix C.

For a practical case with, for example, P = 18 channels of a channel vocoder and a desired number of 11 autocorrelation values for LPC-10, only about 200 multiplications and roughly the same number of additions are needed. Conventional methods require approximately 4000 arithmetic operations.
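The figure of about 200 follows directly from the size of the matrix, since each of the M + 1 output values is a sum of P products. As a quick check, with M = 10 for LPC-10:

```latex
(M + 1) \cdot P = 11 \times 18 = 198 \approx 200 \ \text{multiplications per parameter set}
```

The number of additions is the same if the accumulator starts from zero for each output value, and P - 1 per output value (187 in total) if it starts from the first product.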

A circuit arrangement for carrying out the matrix multiplication described above is explained below with reference to FIG. 3.

The smoothed channel vocoder parameters a_p are applied to an input 1 of a first memory 2. For example, one set of these parameters at a time, i.e. 18 values in the case of 18 channels, is written into the first memory 2.

The following arithmetic operation is to be carried out:

Figure imgb0012 [formula (8)]
where
l_i: LPC vocoder parameters (these correspond to the autocorrelation coefficients r_i in formula (6))
c_ip: transformation coefficients (matrix elements), calculated according to formula (7)
a_p: channel vocoder parameters

To transcode the parameters of a given channel vocoder into the parameters of a given LPC vocoder, the transformation coefficients c_ip of the matrix C are calculated and stored in a coefficient memory 3.

To carry out the matrix multiplication, the channel vocoder parameters a_p in the first memory 2 are addressed one after another by a first counter 4. In the same way, the coefficients c_ip in the coefficient memory 3 are addressed according to their index p.

In a multiplier 5, the addressed channel vocoder parameters a_p and the addressed coefficients c_ip are multiplied, and the products are summed in a downstream adder 6. The index i of the coefficients c_ip is held constant until the index p has reached its greatest value, for example 17 in formula (8). The resulting sum is written into a second memory 7 as LPC parameter l_i. A second counter 8 then increments the index i by one, and the next LPC parameter l_{i+1} is calculated. For this purpose the second counter 8 addresses both the coefficients c_ip in the coefficient memory 3, by their index i, and the LPC vocoder parameters in the second memory 7.

The two counters 4 and 8 are clocked by a clock controller 9.

A transformed, i.e. recoded, set of LPC vocoder parameters is then available at an output 10 of the second memory 7.
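
The multiply-accumulate sequence carried out by the two counters, the multiplier and the adder can be illustrated in software. The following is a minimal sketch, not the circuit itself: the function name `transcode`, the list-based memories and the sample values are assumptions made for illustration only.

```python
def transcode(a, c):
    """Compute l_i = sum over p of c[i][p] * a[p] for every row index i.

    a: channel vocoder parameters a_p (contents of the first memory 2)
    c: coefficients c_ip, one row per output parameter (coefficient memory 3)
    Returns the list of parameters l_i (contents of the second memory 7).
    """
    lpc = []                        # plays the role of the second memory 7
    for i in range(len(c)):         # second counter 8 steps the index i
        acc = 0.0                   # the running sum starts at zero for each i
        for p in range(len(a)):     # first counter 4 steps the index p
            acc += c[i][p] * a[p]   # multiplier 5 feeding adder 6
        lpc.append(acc)             # the finished sum is stored as l_i
    return lpc


# Hypothetical values: three channel parameters, two output parameters.
a = [1.0, 2.0, 3.0]
c = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0]]
print(transcode(a, c))
```

The inner loop runs over p while i stays fixed, which matches the order in which the first counter steps through the channel parameters before the second counter advances.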

Claims (3)

1. A method for recoding digital channel vocoder parameters, obtained from a natural speech signal in the analysis part of the channel vocoder, into digital LPC vocoder parameters, which are processed into a synthetic speech signal in the synthesis part of the LPC vocoder, wherein the channel vocoder parameters are present as a power spectrum, wherein the LPC vocoder parameters are calculated from the short-term autocorrelation function, wherein the power spectrum is smoothed with a smoothing function (g), and wherein the correlation coefficients of the autocorrelation function are calculated from the smoothed power spectrum by an inverse discrete Fourier transform, characterized in that, for a given number of channels of the channel vocoder, a given number of parameters of the LPC vocoder, and a given frequency and time grid, matrix elements (c_ij) are calculated from the quantities that are constant under these conditions and are stored in a coefficient memory (3), so that the LPC vocoder parameters can be derived from the channel vocoder parameters by matrix multiplications, the parameters of each vocoder forming a vector.

2. The method according to claim 1, characterized in that the smoothing function (g = g(i, s)) includes a spread (s) that determines the width of the smoothing function.

3. The method according to claim 1 or claim 2, characterized in that the width (s) of the smoothing function (g) is a function of the parameters of the channel vocoder.
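
Claim 1 relies on the fact that smoothing and the inverse discrete Fourier transform are both linear, so for a fixed grid they can be folded into one precomputed matrix. The sketch below illustrates only that idea. Everything specific in it is an assumption and is not taken from the patent: the Gaussian shape of the smoothing function, the uniform frequency grid, the cosine form of the inverse transform for a real, symmetric spectrum, and all function names. Its output is a set of correlation coefficients; any further conversion into LPC parameters is outside the sketch.

```python
import math


def smoothing_matrix(num_bins, num_channels, s):
    """G[n][p]: weight of channel p at frequency bin n (assumed Gaussian of width s)."""
    centers = [(p + 0.5) * num_bins / num_channels for p in range(num_channels)]
    return [[math.exp(-0.5 * ((n - centers[p]) / s) ** 2)
             for p in range(num_channels)] for n in range(num_bins)]


def cosine_matrix(num_lags, num_bins):
    """D[i][n]: cosine form of an inverse DFT for a real, symmetric spectrum."""
    return [[math.cos(math.pi * i * (n + 0.5) / num_bins) / num_bins
             for n in range(num_bins)] for i in range(num_lags)]


def matmul(x, y):
    """Plain matrix product of two nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def matvec(m, v):
    """Plain matrix-vector product."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def coefficient_matrix(num_lags, num_bins, num_channels, s):
    """Fold smoothing and inverse transform into one fixed matrix (computed once)."""
    return matmul(cosine_matrix(num_lags, num_bins),
                  smoothing_matrix(num_bins, num_channels, s))


# Hypothetical sizes: 4 channels, 64 frequency bins, 5 correlation lags.
C = coefficient_matrix(5, 64, 4, 6.0)
a = [1.0, 0.5, 0.25, 2.0]   # hypothetical channel vocoder parameters
r = matvec(C, a)            # one matrix-vector product per parameter frame
```

Because the matrix depends only on the grid and the smoothing width, it can be computed once and stored; each new frame of channel parameters then costs a single matrix-vector product. If the width s depends on the channel parameters, as in claim 3, the matrix is no longer constant and this sketch does not apply as written.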
EP88115139A 1987-09-23 1988-09-15 Method for converting channel vocoder parameters into lpc vocoder parameters Withdrawn EP0308817A3 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
DE19873732047 DE3732047A1 (en) 1987-09-23 1987-09-23 METHOD FOR RECODING CHANNEL VOCODER PARAMETERS IN LPC VOCODER PARAMETERS
DE3732047 1987-09-23

Publications (2)

Publication Number Publication Date
EP0308817A2 true EP0308817A2 (en) 1989-03-29
EP0308817A3 EP0308817A3 (en) 1990-04-18

Family

ID=6336687

Family Applications (1)

Application Number Title Priority Date Filing Date
EP88115139A Withdrawn EP0308817A3 (en) 1987-09-23 1988-09-15 Method for converting channel vocoder parameters into lpc vocoder parameters

Country Status (2)

Country Link
EP (1) EP0308817A3 (en)
DE (1) DE3732047A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0626675A1 (en) * 1993-05-28 1994-11-30 Motorola Inc. Excitation synchronous time encoding vocoder and method
WO1995022819A1 (en) * 1994-02-16 1995-08-24 Qualcomm Incorporated Vocoder asic
WO1996031873A1 (en) * 1995-04-03 1996-10-10 Universite De Sherbrooke Predictive split-matrix quantization of spectral parameters for efficient coding of speech
AU725711B2 (en) * 1994-02-16 2000-10-19 Qualcomm Incorporated Block normalisation processor

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0138073A1 (en) * 1983-09-29 1985-04-24 Siemens Aktiengesellschaft Data converter for interfacing LPC and channel vocoders for the transmission of digital speech signals with narrow-band transmission systems

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0626675A1 (en) * 1993-05-28 1994-11-30 Motorola Inc. Excitation synchronous time encoding vocoder and method
US5784532A (en) * 1994-02-16 1998-07-21 Qualcomm Incorporated Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system
EP0758123A3 (en) * 1994-02-16 1997-03-12 Qualcomm Incorporated Block normalization processor
US5727123A (en) * 1994-02-16 1998-03-10 Qualcomm Incorporated Block normalization processor
WO1995022819A1 (en) * 1994-02-16 1995-08-24 Qualcomm Incorporated Vocoder asic
AU697822B2 (en) * 1994-02-16 1998-10-15 Qualcomm Incorporated Vocoder asic
US5926786A (en) * 1994-02-16 1999-07-20 Qualcomm Incorporated Application specific integrated circuit (ASIC) for performing rapid speech compression in a mobile telephone system
AU725711B2 (en) * 1994-02-16 2000-10-19 Qualcomm Incorporated Block normalisation processor
SG87819A1 (en) * 1994-02-16 2002-04-16 John G Mcdonough Vocoder asic
CN100397484C (en) * 1994-02-16 2008-06-25 高通股份有限公司 digital signal processor
WO1996031873A1 (en) * 1995-04-03 1996-10-10 Universite De Sherbrooke Predictive split-matrix quantization of spectral parameters for efficient coding of speech
US5664053A (en) * 1995-04-03 1997-09-02 Universite De Sherbrooke Predictive split-matrix quantization of spectral parameters for efficient coding of speech
CN1112674C (en) * 1995-04-03 2003-06-25 舍布鲁克大学 Predictive split-matrix quantization of spectral parameters for efficient coding of speech

Also Published As

Publication number Publication date
EP0308817A3 (en) 1990-04-18
DE3732047A1 (en) 1989-04-06
DE3732047C2 (en) 1992-10-29

Similar Documents

Publication Publication Date Title
DE60317722T2 (en) Method for reducing aliasing interference caused by the adjustment of the spectral envelope in real value filter banks
EP1979901B1 (en) Method and arrangements for audio signal encoding
DE69529356T2 (en) Waveform interpolation by breaking it down into noise and periodic signal components
DE69608947T2 (en) Method of analyzing an audio frequency signal by linear prediction, and application to a method of encoding and decoding an audio frequency signal
DE69518452T2 (en) Procedure for the transformation coding of acoustic signals
EP1825461B1 (en) Method and apparatus for artificially expanding the bandwidth of voice signals
DE69916321T2 (en) CODING OF AN IMPROVEMENT FEATURE FOR INCREASING PERFORMANCE IN THE CODING OF COMMUNICATION SIGNALS
DE69634645T2 (en) Method and apparatus for speech coding
DE60226308T2 (en) Quantization of the excitation in a generalized noise-shaping noise feedback coding system
DE60218385T2 (en) Post-filtering of coded speech in the frequency domain
DE69230308T2 (en) Transformation processing apparatus and method and medium for storing compressed digital data
DE3853916T2 (en) DIGITAL VOICE ENCODER WITH IMPROVED VECTOR EXCITATION SOURCE.
DE60029990T2 (en) SMOOTHING OF THE GAIN FACTOR IN BROADBAND LANGUAGE AND AUDIO SIGNAL DECODER
DE69810361T2 (en) Method and device for multi-channel acoustic signal coding and decoding
DE69317958T2 (en) Low delay audio signal encoder using analysis-by-synthesis techniques
DE60126149T2 (en) METHOD, DEVICE AND PROGRAM FOR CODING AND DECODING AN ACOUSTIC PARAMETER AND METHOD, DEVICE AND PROGRAM FOR CODING AND DECODING SOUNDS
DE69729527T2 (en) Method and device for coding speech signals
DE69426860T2 (en) Speech coder and method for searching codebooks
EP1525576B1 (en) Arrangement and method for the generation of a complex spectral representation of a time-discrete signal
DE2524497A1 (en) PHASE VOCODER SPEECH SYNTHESIS SYSTEM
EP1016319B1 (en) Process and device for coding a time-discrete stereo signal
DE69033510T2 (en) NUMERIC LANGUAGE ENCODER WITH IMPROVED LONG-TERM FORECASTING BY SUBSAMPLE RESOLUTION
DE69708191T2 (en) Signal coding device
DE69420682T2 (en) Speech decoder
DE69028434T2 (en) System for encoding broadband audio signals

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A2

Designated state(s): DE FR GB IT NL

PUAL Search report despatched

Free format text: ORIGINAL CODE: 0009013

AK Designated contracting states

Kind code of ref document: A3

Designated state(s): DE FR GB IT NL

17P Request for examination filed

Effective date: 19900307

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Withdrawal date: 19901207