In this chapter we describe a Bayesian approach to audio source separation. The approach relies on probabilistic modeling of sound sources as (sparse) linear combinations of atoms from a dictionary, and on Markov chain Monte Carlo (MCMC) inference. Several prior distributions are considered for the source expansion coefficients. We first consider general independent and identically distributed (iid) priors, with two choices of distribution. The first is the Student t, which is a good model for sparsity when its shape parameter (the degrees of freedom) takes a low value. The second is a hierarchical mixture distribution: conditionally upon an indicator variable, a coefficient is either set to zero or given a normal distribution whose variance is in turn given an inverted-Gamma distribution. We then consider more audio-specific models in which both the identical-distribution and the independence assumptions are lifted. Using a Modified Discrete Cosine Transform (MDCT) dictionary, which is a time–frequency orthonormal basis, we describe frequency-dependent structured priors that explicitly model the harmonic structure of sound through Markov hierarchical modeling of the expansion coefficients. Separation results are given for a stereophonic recording of three sources.
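To make the two iid priors concrete, the following minimal sketch draws coefficients from each of them. It is an illustration under stated assumptions, not the chapter's implementation: the function names and the parameters `nu`, `lam`, `p_active`, `a` and `b` are hypothetical, the Student t is generated through its standard scale-mixture-of-normals representation, and each active coefficient in the mixture is assumed to have its own inverted-Gamma variance.

```python
import numpy as np


def sample_student_t(rng, n, nu, lam):
    """Draw n Student t coefficients as a scale mixture of normals.

    Assumed parametrization: v ~ InvGamma(nu/2, nu*lam^2/2) and s | v ~ N(0, v),
    so that s / lam follows a standard Student t law with nu degrees of freedom.
    """
    # 1/v is Gamma with shape nu/2 and rate nu*lam^2/2 (NumPy takes scale = 1/rate).
    precision = rng.gamma(shape=nu / 2.0, scale=2.0 / (nu * lam**2), size=n)
    return rng.normal(size=n) / np.sqrt(precision)


def sample_mixture(rng, n, p_active, a, b):
    """Draw n coefficients from the hierarchical mixture prior.

    indicator ~ Bernoulli(p_active); inactive coefficients are exactly zero,
    active ones are N(0, v) with v ~ InvGamma(shape=a, scale=b).
    Assumption: one variance per coefficient.
    """
    indicator = rng.random(n) < p_active
    variance = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=n)
    coeffs = np.where(indicator, rng.normal(size=n) * np.sqrt(variance), 0.0)
    return coeffs, indicator


rng = np.random.default_rng(0)
s_t = sample_student_t(rng, n=10_000, nu=1.0, lam=0.1)
s_mix, active = sample_mixture(rng, n=10_000, p_active=0.1, a=2.0, b=1.0)
print("Student t: fraction with |s| > 5*lam:", np.mean(np.abs(s_t) > 0.5))
print("Mixture: fraction of non-zero coefficients:", np.mean(s_mix != 0))
```

With a low `nu`, most Student t draws are small while a few are very large, which is the sense in which the prior favors sparsity. The mixture prior instead produces exact zeros wherever the indicator is off.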
Copyright information
© 2007 Springer
About this chapter
Cite this chapter
Févotte, C. (2007). Bayesian Audio Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_11
DOI: https://doi.org/10.1007/978-1-4020-6479-1_11
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering, Engineering (R0)