Reversible Speech De-identification Using Parametric Transformations and Watermarking

Aitor Valdivielso, Daniel Erro & Inma Hernaez

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10077)

Included in the following conference series:

International Conference on Advances in Speech and Language Technologies for Iberian Languages

Abstract

This paper presents a system capable of de-identifying speech signals in order to hide and protect the identity of the speaker. It applies a relatively simple yet effective transformation of the pitch and the frequency axis of the spectral envelope thanks to a flexible wideband harmonic model. Moreover, it inserts the parameters of the transformation in the signal by means of watermarking techniques, thus enabling re-identification. Our experiments show that for adequate modification factors its performance is satisfactory in terms of quality, de-identification degree and naturalness. The limitations due to the signal processing framework are discussed as well.
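The reversibility idea can be illustrated with a minimal sketch. It is not the authors' implementation: the harmonic model and the watermark embedding itself are omitted, and every name and constant below (`LO`, `HI`, `BITS`, `F_NYQ`, `F_BREAK`, the one-breakpoint piecewise-linear warping) is an assumption made for illustration. The sketch only shows that if the pitch factor and the frequency-warping factor are quantized, applied, and carried as a bit payload, a receiver that recovers the payload can invert both operations.

```python
from typing import List

# Assumed values, chosen only for this illustration.
LO, HI, BITS = 0.5, 1.5, 8          # factor range and bits per factor
F_NYQ, F_BREAK = 8000.0, 4000.0     # Nyquist frequency and warping breakpoint (Hz)


def quantize(value: float) -> int:
    """Map a factor in [LO, HI] to an integer code of BITS bits."""
    levels = (1 << BITS) - 1
    clipped = min(max(value, LO), HI)
    return round((clipped - LO) / (HI - LO) * levels)


def dequantize(code: int) -> float:
    """Map an integer code back to the factor it represents."""
    levels = (1 << BITS) - 1
    return LO + code / levels * (HI - LO)


def to_bits(code: int) -> List[int]:
    """Most-significant-bit-first representation of a code."""
    return [(code >> i) & 1 for i in reversed(range(BITS))]


def from_bits(bits: List[int]) -> int:
    """Inverse of to_bits."""
    code = 0
    for b in bits:
        code = (code << 1) | b
    return code


def scale_pitch(f0: List[float], alpha: float) -> List[float]:
    """Scale voiced f0 values by alpha; unvoiced frames (0.0) stay 0.0."""
    return [alpha * v if v > 0.0 else 0.0 for v in f0]


def warp(f: float, beta: float) -> float:
    """Piecewise-linear map of [0, F_NYQ] onto itself, slope beta below F_BREAK."""
    knee = beta * F_BREAK
    if f <= F_BREAK:
        return beta * f
    slope = (F_NYQ - knee) / (F_NYQ - F_BREAK)
    return knee + slope * (f - F_BREAK)


def unwarp(g: float, beta: float) -> float:
    """Inverse of warp for the same beta."""
    knee = beta * F_BREAK
    if g <= knee:
        return g / beta
    slope = (F_NYQ - knee) / (F_NYQ - F_BREAK)
    return F_BREAK + (g - knee) / slope


if __name__ == "__main__":
    a_code, b_code = quantize(1.3), quantize(0.85)
    # Transform with the dequantized factors so the receiver can match them exactly.
    alpha, beta = dequantize(a_code), dequantize(b_code)
    payload = to_bits(a_code) + to_bits(b_code)   # bits a watermark would carry

    f0 = [0.0, 120.0, 125.0, 0.0]
    hidden = scale_pitch(f0, alpha)

    # Receiver side: recover the factors from the payload and invert.
    alpha_rx = dequantize(from_bits(payload[:BITS]))
    beta_rx = dequantize(from_bits(payload[BITS:]))
    print(scale_pitch(hidden, 1.0 / alpha_rx))
    print(unwarp(warp(3000.0, beta), beta_rx))
```

Applying the dequantized factors, rather than the raw ones, is a deliberate choice in this sketch: the receiver only ever sees the quantized payload, so the transformation must use exactly the values that the payload encodes.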

Notes

1. Voice conversion can be seen as a particular case of voice transformation where there is a specific target speaker.
2. PESQ predicts the mean opinion score of a distorted signal in comparison with its original clean version.

Acknowledgements

This work has been partially funded by the Spanish Ministry of Economy and Competitiveness (RESTORE project, TEC2015-67163-C2-1-R MINECO/FEDER,UE) and the Basque Government (ELKAROLA, KK-2015/00098).

Author information

Authors and Affiliations

Aholab, University of the Basque Country (UPV/EHU), Bilbao, Spain
Aitor Valdivielso, Daniel Erro & Inma Hernaez
IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
Daniel Erro

Corresponding author

Correspondence to Daniel Erro.

Editor information

Editors and Affiliations

INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Alberto Abad
I3A/University of Zaragoza, Zaragoza, Spain
Alfonso Ortega
DETI/IEETA, University of Aveiro, Aveiro, Portugal
António Teixeira
AtlantTIC Research Center, Universidad de Vigo, Vigo, Spain
Carmen García Mateo
Universitat Politècnica de València, Valencia, Spain
Carlos D. Martínez Hinarejos
University of Coimbra, Coimbra, Portugal
Fernando Perdigão
INESC-ID/ISCTE-IUL, Lisbon, Portugal
Fernando Batista
INESC-ID/IST, Universidade de Lisboa, Lisbon, Portugal
Nuno Mamede

About this paper

Cite this paper

Valdivielso, A., Erro, D., Hernaez, I. (2016). Reversible Speech De-identification Using Parametric Transformations and Watermarking. In: Abad, A., et al. Advances in Speech and Language Technologies for Iberian Languages. IberSPEECH 2016. Lecture Notes in Computer Science, vol 10077. Springer, Cham. https://doi.org/10.1007/978-3-319-49169-1_26

DOI: https://doi.org/10.1007/978-3-319-49169-1_26
Published: 04 November 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-49168-4
Online ISBN: 978-3-319-49169-1
eBook Packages: Computer Science, Computer Science (R0)
