CAPE: Context-Aware Private Embeddings for Private Language Learning

Richard Plant, Dimitra Gkatzia, Valerio Giuffrida

Abstract

Neural language models have contributed to state-of-the-art results in a number of downstream applications, including sentiment analysis and intent classification. However, obtaining text representations or embeddings using these models risks encoding personally identifiable information learned from language and context cues, which may lead to privacy leaks. To ameliorate this issue, we propose Context-Aware Private Embeddings (CAPE), a novel approach which combines differential privacy and adversarial learning to preserve privacy during training of embeddings. Specifically, CAPE first applies calibrated noise through differential privacy to maintain the privacy of text representations by preserving the encoded semantic links while obscuring sensitive information. Next, CAPE employs an adversarial training regime that obscures identified private variables. Experimental results demonstrate that our proposed approach is more effective in reducing private information leakage than either intervention alone, with approximately a 3% reduction in attacker performance compared to the best-performing current method.
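
The first stage described above, adding calibrated noise to a text representation, can be illustrated with a minimal sketch. The abstract does not specify the mechanism, so everything here is an assumption made for illustration: the helper name `privatize_embedding`, the min-max normalization, the Laplace noise distribution, and the per-coordinate bound of 1 used as the sensitivity. The adversarial training stage is not shown.

```python
import numpy as np


def privatize_embedding(embedding, epsilon, rng=None):
    """Hypothetical helper: perturb an embedding with Laplace noise.

    Assumptions (not taken from the paper): values are min-max scaled
    into [0, 1], and the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(embedding, dtype=float)

    # Scale every coordinate into [0, 1] so each value is bounded.
    span = x.max() - x.min()
    normalized = (x - x.min()) / span if span > 0 else np.zeros_like(x)

    # Assumed per-coordinate sensitivity of 1 after normalization.
    sensitivity = 1.0
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=x.shape)

    # Smaller epsilon means a larger noise scale and a stronger perturbation.
    return normalized + noise


if __name__ == "__main__":
    vector = np.array([0.2, -1.0, 3.5, 0.0])
    print(privatize_embedding(vector, epsilon=1.0, rng=np.random.default_rng(0)))
```

In this sketch `epsilon` is the usual privacy budget: lowering it widens the noise distribution, trading representation fidelity for stronger obfuscation.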

Anthology ID: 2021.emnlp-main.628
Original: 2021.emnlp-main.628v1
Version 2: 2021.emnlp-main.628v2
Volume: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
Month: November
Year: 2021
Address: Online and Punta Cana, Dominican Republic
Editors: Marie-Francine Moens, Xuanjing Huang, Lucia Specia, Scott Wen-tau Yih
Venue: EMNLP
Publisher: Association for Computational Linguistics
Pages: 7970–7978
URL: https://aclanthology.org/2021.emnlp-main.628
DOI: 10.18653/v1/2021.emnlp-main.628
Cite (ACL): Richard Plant, Dimitra Gkatzia, and Valerio Giuffrida. 2021. CAPE: Context-Aware Private Embeddings for Private Language Learning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 7970–7978, Online and Punta Cana, Dominican Republic. Association for Computational Linguistics.
Cite (Informal): CAPE: Context-Aware Private Embeddings for Private Language Learning (Plant et al., EMNLP 2021)
PDF: https://aclanthology.org/2021.emnlp-main.628.pdf
Video: https://aclanthology.org/2021.emnlp-main.628.mp4
