Blind signal separation with Noise Reduction for efficient speaker identification

Hossam Hammam¹,
Walid El-Shafai ORCID: orcid.org/0000-0001-7509-2120²,
Emad Hassan²,
Atef E. Abu El-Azm²,
Moawad I. Dessouky²,
Mohamed E. Elhalawany² &
…
Fathi E. Abd El-Samie²,³

Abstract

Most blind signal separation algorithms address the separation problem in the absence of noise, yet noise degrades the quality of the separated signals. This paper deals with the blind separation of audio signals from noisy mixtures. Instead of performing the separation on the mixtures in the time domain, the blind signal separation algorithm is applied to the discrete cosine transform, the discrete sine transform, or the discrete wavelet transform of the mixed signals. All of these transforms have an energy compaction property, which concentrates most of the signal energy in a few transform-domain coefficients and leaves most of the remaining coefficients close to zero. As a result, the separation is performed on a few coefficients in the transform domain. Another advantage of separation in the transform domain is that noise affects the signals less there than in the time domain. The paper also investigates the role of speech enhancement techniques as pre- and post-processing steps for the blind signal separation process. The speech enhancement techniques considered are spectral subtraction, Wiener filtering, adaptive Wiener filtering, and wavelet denoising. Both blind signal separation and noise reduction are applied within a real speaker identification system to reduce the effect of interference and noise on system performance. The simulation results confirm the superiority of transform-domain separation over time-domain separation and the importance of wavelet denoising when used as a pre-processing step for noise reduction. Moreover, the performance of the speaker identification system improves with blind signal separation and noise reduction.
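
As a rough illustration of why the separation can be moved to a transform domain, the following Python sketch computes an orthonormal DCT-II of a mixture and checks that the mixing relation is unchanged by the transform. The two test signals, the 2×2 mixing matrix, and the helper names are assumptions made for this example and are not taken from the paper; instantaneous (non-convolutive) mixing is also assumed. The script prints the share of energy held by the eight largest DCT coefficients so that the compaction can be inspected for these toy signals.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = sum(x[t] * math.cos(math.pi * (t + 0.5) * k / n) for t in range(n))
        # Scaling that makes the transform orthonormal (energy preserving).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * acc)
    return out


def energy_fraction_top_k(coeffs, k):
    """Fraction of the total energy held by the k largest-magnitude coefficients."""
    energies = sorted((c * c for c in coeffs), reverse=True)
    return sum(energies[:k]) / sum(energies)


n = 64

# Hypothetical smooth "sources" standing in for short audio frames.
s1 = [math.cos(2 * math.pi * 3 * t / n) + 0.5 * math.cos(2 * math.pi * 5 * t / n)
      for t in range(n)]
s2 = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]

# Hypothetical instantaneous mixing matrix (an assumption, not from the paper).
a = [[1.0, 0.6],
     [0.4, 1.0]]
x1 = [a[0][0] * u + a[0][1] * v for u, v in zip(s1, s2)]
x2 = [a[1][0] * u + a[1][1] * v for u, v in zip(s1, s2)]

S1, S2 = dct2_ortho(s1), dct2_ortho(s2)
X1, X2 = dct2_ortho(x1), dct2_ortho(x2)

# Linearity: the DCT of a mixture equals the same mixture of the source DCTs,
# so the mixing matrix seen by a separation algorithm is the same in both domains.
lin_err = max(abs(X1[k] - (a[0][0] * S1[k] + a[0][1] * S2[k])) for k in range(n))

# Energy compaction of the first mixture, measured on its 8 largest coefficients.
frac8 = energy_fraction_top_k(X1, 8)

print("max linearity error:", lin_err)
print("energy share of 8 largest DCT coefficients:", frac8)
```

Because the transform is linear and orthonormal, a separation algorithm that estimates the mixing matrix sees the same matrix in the DCT domain, while the signal energy is redistributed across coefficients. Real speech frames compact less cleanly than these toy cosines, so the printed share should be read only as an indication.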

Author information

Authors and Affiliations

Telecom Egypt, Alexandria, Egypt
Hossam Hammam
Department of Electronics and Electrical Communications, Faculty of Electronic Engineering, Menoufia University, Menouf, 32952, Egypt
Walid El-Shafai, Emad Hassan, Atef E. Abu El-Azm, Moawad I. Dessouky, Mohamed E. Elhalawany & Fathi E. Abd El-Samie
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, Riyadh, 21974, Saudi Arabia
Fathi E. Abd El-Samie

Corresponding author

Correspondence to Walid El-Shafai.

About this article

Cite this article

Hammam, H., El-Shafai, W., Hassan, E. et al. Blind signal separation with Noise Reduction for efficient speaker identification. Int J Speech Technol 24, 235–250 (2021). https://doi.org/10.1007/s10772-019-09641-6

Received: 05 March 2019
Accepted: 26 September 2019
Published: 16 January 2021
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10772-019-09641-6
