CELP
Aneek Anwar, Zaeem Varaich, Bilal Hassan, Hashim Bhatti
2012-MS-EE-067, 2012-MS-EE-078, 2012-MS-EE-075, 2008-MS-EE-116
Introduction
CELP (code-excited linear prediction) is a speech coding algorithm proposed by Schroeder and Atal.
It is one of the most widely used algorithms for lossy compression of speech.
It is based on the idea of linear prediction (LPC).
The name is also used as a generic term for a variety of codecs, such as:
MPEG-4 Part 3 (CELP as an MPEG-4 Audio Object Type)
G.728 - coding of speech at 16 kbit/s using low-delay code excited linear prediction
G.718 - uses CELP for the lower two layers for the band (50-6400 Hz) in a two-stage coding structure
G.729.1 - uses CELP coding for the lower band (50-4000 Hz) in a three-stage coding structure
The signal is split into short-time overlapping frames by windowing.
All subsequent processing is done on these frames.
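The framing step can be sketched as follows. This is a minimal illustration in plain Python: the Hamming window, the 160-sample frame length, and the 80-sample hop (20 ms frames with 50% overlap at 8 kHz) are assumptions chosen for the example, not values given in the slides, and `hamming` and `frame_signal` are hypothetical helper names.

```python
import math

def hamming(n):
    """Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def frame_signal(signal, frame_len=160, hop=80):
    """Split `signal` into overlapping frames and window each frame.

    frame_len and hop are illustrative defaults (assumed, not from the slides).
    Trailing samples that do not fill a whole frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames

# Example: one second of a 200 Hz tone sampled at 8 kHz.
fs = 8000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(fs)]
frames = frame_signal(tone)
print(len(frames), len(frames[0]))
```

Each frame overlaps its neighbor by `frame_len - hop` samples, so parameters estimated per frame change smoothly from one frame to the next.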
LPCs contd.
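We can predict the next sample using a linear combination of the previous p samples, hence the name linear prediction:
$$s(n) \approx a_1 s(n-1) + a_2 s(n-2) + \dots + a_p s(n-p)$$
or, with the prediction error e(n) written explicitly,
$$s(n) = \sum_{k=1}^{p} a_k s(n-k) + e(n)$$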
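Taking the z-transform of the prediction error $e(n) = s(n) - \sum_{k=1}^{p} a_k s(n-k)$, we get
$$E(z) = S(z)\left(1 - \sum_{k=1}^{p} a_k z^{-k}\right) = S(z)A(z)$$
So S(z) is given by
$$S(z) = \frac{E(z)}{A(z)} = E(z)H(z), \quad H(z) = \frac{1}{A(z)}$$
So we just need to find the coefficients $a_k$ to model the filter.
The $a_k$ can be computed by minimizing the squared prediction error (least-squares criterion), for example with the Levinson-Durbin algorithm.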
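As an illustration of how the coefficients might be computed, here is a sketch of the Levinson-Durbin recursion applied to the autocorrelation of a frame. It assumes the sign convention A(z) = 1 - sum of a_k z^-k (so that s(n) is predicted by the sum of a_k s(n-k)); the function names `autocorr` and `levinson_durbin` are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r(0..max_lag) of one (windowed) frame x."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, p):
    """Solve the autocorrelation normal equations for a_1..a_p.

    Returns (a, err): a[k-1] is a_k in s(n) ~ sum_k a_k s(n-k),
    err is the remaining prediction error energy.
    """
    a = [0.0] * (p + 1)          # a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        if err <= 0.0:           # degenerate frame (e.g. all zeros)
            break
        # Reflection coefficient for order i.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

# Example: autocorrelation of a first-order process with a_1 = 0.5.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], 2)
print(coeffs, residual)
```

The recursion builds the order-p solution from the lower-order ones, which avoids a general matrix inversion.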
Long-term prediction:
$$\hat{x}(n) = b\,x(n - M)$$
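A minimal sketch of how a long-term predictor of the form x̂(n) = b x(n - M) could be fitted: for each candidate lag M, the least-squares gain b has a closed form, and the lag that removes the most error energy is kept. The lag range and the test signal are assumptions made for the example, and `long_term_predictor` is a hypothetical name.

```python
import math

def long_term_predictor(x, min_lag, max_lag):
    """Return (M, b) minimizing sum over n >= max_lag of (x(n) - b*x(n-M))^2."""
    best_lag, best_gain, best_score = None, 0.0, 0.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(x[n] * x[n - lag] for n in range(max_lag, len(x)))
        den = sum(x[n - lag] ** 2 for n in range(max_lag, len(x)))
        if den == 0.0:
            continue
        # With the optimum gain num/den, the error energy drops by num^2/den.
        score = num * num / den
        if score > best_score:
            best_lag, best_gain, best_score = lag, num / den, score
    return best_lag, best_gain

# Example: a signal that repeats every 50 samples.
x = [math.sin(2 * math.pi * n / 50) + 0.5 * math.sin(4 * math.pi * n / 50)
     for n in range(400)]
M, b = long_term_predictor(x, min_lag=20, max_lag=80)
print(M, b)
```

The search range is kept below twice the period here because integer multiples of the true period predict a perfectly periodic signal equally well.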
Vector Quantization
Vector quantization (VQ) allows the modeling of probability density functions by the distribution of prototype vectors. It works by dividing a large set of points (vectors) into groups having approximately the same number of points closest to them. Each group is represented by its centroid point, as in k-means and some other clustering algorithms. Since data points are represented by the index of their closest centroid, commonly occurring data have low error, and rare data high error. VQ is quite suitable for lossy data compression.
A vector quantizer maps k-dimensional vectors in the vector space $\mathbb{R}^k$ into a finite set of vectors.
Unquantized vector: $x = [x_0\ x_1\ \dots\ x_{N-1}]^T$
Quantized vector: $y = [y_0\ y_1\ \dots\ y_{N-1}]^T$, with $y = \mathrm{VQ}(x) = r_i$ for $x \in C_i$
Reconstruction vector (codeword): $r_i$
Codebook: the set of all the codewords
Voronoi region: nearest-neighbor region $C_i$ of codeword $r_i$
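The nearest-neighbor mapping can be sketched in a few lines. This example assumes squared Euclidean distance as the distortion measure and uses a small hand-made codebook; `quantize` is a hypothetical helper name, and codebook training (for example by k-means) is not shown.

```python
def quantize(x, codebook):
    """Return (i, r_i): index and codeword nearest to x in squared Euclidean distance."""
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    index = min(range(len(codebook)), key=lambda i: dist2(x, codebook[i]))
    return index, codebook[index]

# Illustrative 2-dimensional codebook with four codewords.
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 1.0), (0.0, -1.0)]
index, y = quantize((0.9, 0.7), codebook)
print(index, y)
```

Only `index` needs to be transmitted; a decoder holding the same codebook recovers the codeword by table lookup.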
CELP
CELP is based on the previously discussed concepts
An LPC filter is used to model the vocal tract, along with long-term prediction.
The error signal, which acts as the excitation signal in the source-filter model, is quantized using VQ; both fixed and adaptive codebooks are used.
A perceptual weighting filter is added to reduce the perceived noise.
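The slides do not give a formula for the perceptual weighting filter. One commonly used form, assumed here purely for illustration, is W(z) = A(z) / A(z/γ) with 0 < γ < 1, where A(z) = 1 - sum of a_k z^-k. The sketch below applies that filter to a signal; the function name `perceptual_weighting` and the value of γ are assumptions.

```python
def perceptual_weighting(x, a, gamma=0.9):
    """Filter x through W(z) = A(z) / A(z/gamma), zero initial state.

    A(z) = 1 - sum_k a[k-1] z^-k. gamma = 0.9 is an illustrative choice.
    Difference equation:
        w(n) = x(n) - sum_k a_k x(n-k) + sum_k a_k gamma^k w(n-k)
    """
    w = []
    for n in range(len(x)):
        acc = x[n]
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * x[n - k]
                acc += ak * (gamma ** k) * w[n - k]
        w.append(acc)
    return w

# gamma = 1 gives W(z) = 1 (no weighting); gamma = 0 gives W(z) = A(z).
print(perceptual_weighting([1.0, 2.0, 3.0], [0.5], gamma=1.0))
print(perceptual_weighting([1.0, 2.0, 3.0], [0.5], gamma=0.0))
```

Values of γ between these extremes de-emphasize the error near the formant peaks of 1/A(z), which is where the speech itself masks noise best.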
CELP algorithm
Encoding
LPC analysis gives H(z).
Define the perceptual weighting filter. This permits more noise at formant frequencies, where it will be masked by the speech.
Synthesize speech using each codebook entry in turn as the input to H(z).
Calculate the optimum gain to minimize the perceptually weighted error energy in the speech frame.
Select the codebook entry that gives the lowest error.
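The encoding search can be sketched as an exhaustive loop over the codebook. This is a simplified illustration, not a complete encoder: it assumes the target frame has already been perceptually weighted, ignores filter memory and the adaptive codebook, and uses hypothetical names (`synthesize`, `search_codebook`).

```python
def synthesize(excitation, a):
    """All-pole filter H(z) = 1/A(z), A(z) = 1 - sum_k a[k-1] z^-k, zero state."""
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc += ak * out[n - k]
        out.append(acc)
    return out

def search_codebook(target, codebook, a):
    """Return (index, gain, error) of the entry that best matches target."""
    best = None
    target_energy = sum(t * t for t in target)
    for index, code in enumerate(codebook):
        y = synthesize(code, a)
        yy = sum(v * v for v in y)
        if yy == 0.0:
            continue
        ty = sum(t * v for t, v in zip(target, y))
        gain = ty / yy                       # optimum least-squares gain
        error = target_energy - ty * ty / yy  # error energy at that gain
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

# Example: the target is entry 1 synthesized and scaled by 2.
a = [0.9]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0], [1.0, 1.0, 1.0, 1.0]]
target = [2.0 * v for v in synthesize(codebook[1], a)]
index, gain, error = search_codebook(target, codebook, a)
print(index, gain, error)
```

Because the gain has a closed-form optimum for each entry, only the index needs to be searched exhaustively.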
Decoding
Receive the LPC parameters and codebook index.
Resynthesize speech using H(z) and the codebook entry.
[Figure: CELP synthesis model with blocks 1/B(z), gain g, 1/A(z), and LS criterion]
where
1/B(z) represents a long-term prediction filter
1/A(z) is the LPC filter
g is the gain
[Figure: synthesis model with blocks 1/B(z), gain, 1/A(z), W(z), error, and LS criterion; plot of 1/A(f) and W(f)]
[Figure: CELP encoder with LPC analysis, waveforms codebook, gain g, 1/B(z), 1/A(z), synthetic speech, and W(z)]
Search for the best code and gain: iteration on the whole codebook.
[Figure: codebook search with adaptive codebook (gain g1, H(z)) and stochastic codebook (gain g2, H(z)), H filter memory, and least-squares criterion]
Transmitted Parameters
Adaptive codebook gain and index
Fixed codebook gain and index
LPC filter coefficients
CELP Decoder
Decode the received parameters:
Index of stochastic codebook
Gain of stochastic codebook
Index of adaptive codebook
Gain of adaptive codebook
Linear prediction filter coefficients
[Figure: CELP decoder. Adaptive codebook entry c1,i(0) with gain g1 and stochastic codebook entry c2,i(1) with gain g2 feed 1/A(z) to produce the synthetic speech.]
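A simplified sketch of decoding one subframe from the received parameters. It assumes the adaptive codebook acts as the recursive long-term filter 1/B(z) with B(z) = 1 - g1 z^-lag, and that A(z) = 1 - sum of a_k z^-k; real codecs differ in detail, and all names here (`decode_subframe` and its arguments) are hypothetical.

```python
def decode_subframe(lag, g1, index, g2, a, fixed_codebook,
                    excitation_history, speech_history):
    """Rebuild one subframe of speech.

    excitation_history must hold at least `lag` past excitation samples and
    speech_history at least len(a) past output samples (zeros at start-up).
    Returns (speech, new_excitation_history, new_speech_history).
    """
    excitation_history = list(excitation_history)
    speech_history = list(speech_history)
    start = len(speech_history)
    for c in fixed_codebook[index]:
        # Excitation: scaled stochastic entry plus scaled past excitation.
        e = g2 * c + g1 * excitation_history[-lag]
        excitation_history.append(e)
        # Synthesis filter 1/A(z), using past output as filter memory.
        s = e + sum(ak * speech_history[-k] for k, ak in enumerate(a, start=1))
        speech_history.append(s)
    return speech_history[start:], excitation_history, speech_history

# Example with a one-entry codebook and zero initial memories.
fixed_codebook = [[1.0, 0.0, 0.0, 0.0]]
speech, exc_hist, sp_hist = decode_subframe(
    lag=2, g1=0.5, index=0, g2=1.0, a=[0.5],
    fixed_codebook=fixed_codebook,
    excitation_history=[0.0, 0.0], speech_history=[0.0])
print(speech)
```

Returning the updated histories lets the caller carry the adaptive codebook and the synthesis filter memory into the next subframe.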
Compression Ratio
For normal PCM speech, we use 8 bits per sample at a sampling rate of 8 kHz, so the data rate is 8 × 8000 = 64 kbps.
For various CELP standards, the data rate can be as low as 6 kbps for the same signal, so
COMPRESSION RATIO = 64/6 ≈ 10.7
References
B. S. Atal, "The History of Linear Prediction," IEEE Signal Processing Magazine, vol. 23, no. 2, pp. 154-161, March 2006.
M. R. Schroeder and B. S. Atal, "Code-excited linear prediction (CELP): high-quality speech at very low bit rates," in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), vol. 10, pp. 937-940, 1985.
L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals. Prentice-Hall (Signal Processing Series), 1978.