An Adaptive Non Reference Anchor Array Framework for Audio Retrieval in Teleconferencing Environment

Karan Nathwani, Arpit Shukla, Shubham Khunteta & Rajesh M. Hegde

Abstract

In this paper, an adaptive framework for audio retrieval in live teleconferencing environments with multiple participants is proposed. The framework uses a non reference anchor array (NRA) to capture the interfering speech sources, in addition to the primary array that captures the speech source of interest (SOI). A linearly constrained minimum variance (LC-MV) beamformer is used so that the signal arriving from the look direction is preserved while interference arriving from non-look directions is nulled. Additionally, the reverberant component of the speech acquired by this framework is removed by a novel method that uses the linear prediction (LP) residual cepstrum. This method does not require computation of the acoustic impulse response (AIR) of the teleconferencing room and is therefore computationally efficient. The NRA framework is thus able to remove correlated noise coming from the direction of the SOI and to dereverberate the noise-free signal. The performance of the proposed framework is evaluated through experiments on clean speech acquisition from distant microphone arrays. Experiments on distant speech recognition are also conducted using the TIMIT and MONC databases. The experimental results indicate a reasonable improvement over correlation, subspace and standard minimum variance beamforming methods. The application of the framework to audio retrieval in a live teleconferencing environment with multiple participants is also discussed.
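To make the look-direction constraint concrete, the following is a minimal sketch of the generic textbook LC-MV solution, w = R⁻¹C (Cᴴ R⁻¹ C)⁻¹ f, which minimises the output power wᴴRw subject to Cᴴw = f. It is not the NRA framework of the paper: the uniform linear array geometry, the 1 kHz narrowband frequency, the source angles, and the synthetic covariance matrix are all assumptions chosen for illustration, and the function names `steering_vector` and `lcmv_weights` are hypothetical.

```python
import numpy as np

def steering_vector(n_mics, spacing_m, angle_deg, freq_hz, c=343.0):
    """Far-field steering vector for a uniform linear array (assumed geometry)."""
    delays = np.arange(n_mics) * spacing_m * np.sin(np.deg2rad(angle_deg)) / c
    return np.exp(-2j * np.pi * freq_hz * delays)

def lcmv_weights(R, C, f):
    """Textbook LCMV weights: minimise w^H R w subject to C^H w = f."""
    Rinv_C = np.linalg.solve(R, C)                 # R^{-1} C without forming R^{-1}
    return Rinv_C @ np.linalg.solve(C.conj().T @ Rinv_C, f)

# Assumed scenario: 8 microphones, 4 cm spacing, 1 kHz narrowband signal.
n_mics, spacing_m, freq_hz = 8, 0.04, 1000.0
a_look = steering_vector(n_mics, spacing_m, 0.0, freq_hz)    # source of interest
a_intf = steering_vector(n_mics, spacing_m, 40.0, freq_hz)   # interfering source

# Synthetic covariance: one strong interferer plus white sensor noise.
R = 10.0 * np.outer(a_intf, a_intf.conj()) + 0.1 * np.eye(n_mics)

# Constraints: unit gain toward the look direction, a null toward the interferer.
C = np.column_stack([a_look, a_intf])
f = np.array([1.0, 0.0])
w = lcmv_weights(R, C, f)

gain_look = abs(w.conj() @ a_look)   # response toward the SOI
gain_intf = abs(w.conj() @ a_intf)   # response toward the interferer
print(f"look-direction gain: {gain_look:.3f}, interferer gain: {gain_intf:.2e}")
```

In this sketch the interferer direction is assumed to be known exactly. In practice it would have to be estimated, and the abstract indicates that the NRA is what captures the interfering sources; how that information enters the beamformer is not shown here.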

Acknowledgments

This work was supported in part by the DeITY, Government of India, and in part by the BSNL Telecom Center of Excellence, IIT Kanpur.

Author information

Authors and Affiliations

Department of Electrical Engineering, Indian Institute of Technology, Kanpur, 16, India
Karan Nathwani, Arpit Shukla, Shubham Khunteta & Rajesh M. Hegde

Corresponding author

Correspondence to Rajesh M. Hegde.

About this article

Cite this article

Nathwani, K., Shukla, A., Khunteta, S. et al. An Adaptive Non Reference Anchor Array Framework for Audio Retrieval in Teleconferencing Environment. J Sign Process Syst 74, 91–102 (2014). https://doi.org/10.1007/s11265-013-0786-7

Received: 15 January 2013
Revised: 18 May 2013
Accepted: 21 May 2013
Published: 19 June 2013
Issue Date: January 2014
DOI: https://doi.org/10.1007/s11265-013-0786-7