Spectral analysis of bone-conducted speech using modified linear prediction

Ohidujjaman¹ (ORCID: orcid.org/0000-0002-8776-7145),
Mahmudul Hasan²,
Shiming Zhang³,
Mohammad Nurul Huda⁴ &
Mohammad Shorif Uddin⁵,⁶

Abstract

This paper improves the performance of linear prediction (LP) for precise spectral estimation of bone-conducted (BC) speech. BC speech inherently has a wide spectral dynamic range, which causes ill-conditioning in the autocorrelation (ACR) method and its variants, where the Levinson–Durbin (L–D) algorithm is commonly used. Instead of these conventional LP-based spectral estimation methods, we use a covariance-based method, specifically the modified covariance (MC) method, solved with an orthogonal decomposition algorithm. We derive the MC method from the least squares (LS) technique for BC speech analysis. The MC method reduces the eigenvalue expansion, which compresses the spectral dynamic range of the BC speech signal, and this compression in turn mitigates the ill-conditioning of LP. For synthetic BC speech, the power spectrum obtained with the proposed method has more accurate peaks than those of the conventional methods. The validity of the proposed method is also examined on real BC speech. The experimental results demonstrate that the proposed method provides more accurate spectral estimation for both synthetic and real BC speech than conventional spectral estimation methods. The study thereby highlights the usefulness of BC speech in speech processing systems.
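
As a rough illustration of the least squares idea behind the MC method, the sketch below estimates LP coefficients by minimizing the summed forward and backward prediction error energies. It is not the authors' implementation: the function name `modified_covariance_lp`, the test signal, and the model order are assumptions made for this example, and the normal equations are solved by plain Gaussian elimination rather than the orthogonal decomposition algorithm named in the abstract.

```python
import math


def modified_covariance_lp(x, p):
    """Estimate a_1..a_p of A(z) = 1 + sum_k a_k z^-k by forward-backward least squares."""
    n_samples = len(x)

    def c(j, k):
        # Forward term x[n-j]x[n-k] plus backward term x[n-p+j]x[n-p+k],
        # summed over every n for which both prediction errors are defined.
        return sum(x[n - j] * x[n - k] + x[n - p + j] * x[n - p + k]
                   for n in range(p, n_samples))

    # Augmented normal equations: sum_k c(j, k) a_k = -c(j, 0), j = 1..p.
    m = [[c(j, k) for k in range(1, p + 1)] + [-c(j, 0)]
         for j in range(1, p + 1)]

    # Gaussian elimination with partial pivoting (illustrative solver only,
    # swapped in for the orthogonal decomposition mentioned in the abstract).
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, p):
            f = m[r][col] / m[col][col]
            for cc in range(col, p + 1):
                m[r][cc] -= f * m[col][cc]

    # Back substitution.
    a = [0.0] * p
    for r in range(p - 1, -1, -1):
        s = m[r][p] - sum(m[r][cc] * a[cc] for cc in range(r + 1, p))
        a[r] = s / m[r][r]
    return a


# Hypothetical test signal: a noiseless sinusoid, not BC speech.
omega = 0.3 * math.pi
x = [math.cos(omega * n) for n in range(64)]
a = modified_covariance_lp(x, 2)
# Pole angle of z^2 + a_1 z + a_2, valid here because a_2 is close to 1.
estimated_omega = math.acos(-a[0] / 2.0)
print(a, estimated_omega)
```

A noiseless sinusoid satisfies a second-order recursion exactly in both the forward and backward directions, so the estimated pole angle should match the true frequency closely even for a short record. Real BC speech would need a higher order and a numerically more careful solver.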

Data availability

Data will be made available on request.

Materials availability

Materials will be made available on request.

Code availability

The code is private, but the underlying idea can be shared on request.

Acknowledgements

We sincerely thank United International University (UIU) for supporting this research. This research was funded by the Institute for Advanced Research Publication Grant of United International University, Ref. No.: IAR-2024-Pub-054.

Funding

This research was funded by the Institute for Advanced Research Publication Grant of United International University, Ref. No.: IAR-2024-Pub-054.

Author information

Authors and Affiliations

Computer Science and Engineering, Daffodil International University, Dhaka, 1216, Bangladesh
Ohidujjaman
Computer Science and Engineering, Comilla University, Comilla, 3506, Bangladesh
Mahmudul Hasan
School of Electrical and Information, Northeast Agricultural University, Harbin, 150030, China
Shiming Zhang
Computer Science and Engineering, United International University, Dhaka, 1212, Bangladesh
Mohammad Nurul Huda
Computer Science and Engineering, Green University of Bangladesh, Kanchon, 1460, Bangladesh
Mohammad Shorif Uddin
Computer Science and Engineering, Jahangirnagar University, Savar, 1342, Bangladesh
Mohammad Shorif Uddin

Contributions

Ohidujjaman: writing—original draft, writing—review & editing, conceptualization, formal analysis, data curation. Mahmudul Hasan: conceptualization, formal analysis, data curation. Shiming Zhang: writing—review & editing, data curation. Mohammad Nurul Huda: conceptualization, methodology. Mohammad Shorif Uddin: writing—review & editing, visualization, supervision.

Corresponding authors

Correspondence to Ohidujjaman or Mahmudul Hasan.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Consent for publication

We affirm that this manuscript is unique, unpublished, and not under consideration for publication elsewhere. We confirm that the manuscript has been read and approved by all named authors and that there are no other people who meet the criteria for authorship but are not listed. We further affirm that we have all approved the order of authors listed in the manuscript. We understand that the corresponding author is the sole contact for the editorial process. He is responsible for communicating with the other authors about progress, submissions of revisions, and final approval of proofs.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ohidujjaman, Hasan, M., Zhang, S. et al. Spectral analysis of bone-conducted speech using modified linear prediction. Int J Speech Technol 27, 1039–1053 (2024). https://doi.org/10.1007/s10772-024-10151-3

Received: 01 June 2024
Accepted: 13 September 2024
Published: 16 October 2024
Issue Date: December 2024
DOI: https://doi.org/10.1007/s10772-024-10151-3
