Abstract
With the ever-increasing volume of digital data, data deduplication in archival storage has become imperative for large-scale cloud systems. In a storage system based on cross-user data deduplication, a client uploads data to the server only if neither that client nor any other client has uploaded it previously. Such storage systems therefore optimize the utilization of both storage and bandwidth resources. Although existing cross-user data deduplication techniques satisfy some data privacy requirements (e.g., confidentiality and tag consistency), they leak information to the communicating parties. In particular, in those settings the clients and the server may learn which clients own identical data. In this paper, we introduce a privacy requirement for data deduplication techniques, called data-unlinkability, to address this problem. Moreover, we propose a novel data deduplication approach that uses a Bloom filter to provide data-unlinkability for clients’ files. Our proposed approach does not use a Proof of Ownership (POW) scheme, and only entities that have access to the entire content of a file can obtain the information needed to retrieve it. The security and performance analyses suggest that our technique satisfies the privacy requirements while optimizing storage space and bandwidth.
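As a rough illustration of the two building blocks named in the abstract, the sketch below pairs a plain Bloom filter with a naive cross-user deduplication check. It is not the protocol proposed in the paper: the class names `BloomFilter` and `DedupServer`, the filter size, the number of hash functions, and the use of SHA-256 tags are all assumptions made for this example. The naive server shown here learns which uploads are identical, which is exactly the leakage that data-unlinkability is meant to prevent.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter over byte strings (hypothetical helper)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive one bit position per hash function by prefixing a counter byte.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: bytes) -> bool:
        # May return a false positive, never a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


class DedupServer:
    """Naive cross-user deduplication (assumed design, no privacy protection)."""

    def __init__(self) -> None:
        self.seen = BloomFilter(num_bits=1 << 16, num_hashes=4)
        self.store = {}  # exact index: tag -> data
        self.bytes_received = 0

    def upload(self, data: bytes) -> bool:
        """Return True if the data was transferred, False if deduplicated."""
        tag = hashlib.sha256(data).digest()
        # The Bloom filter is a cheap pre-check; a positive answer is
        # confirmed against the exact index to rule out false positives.
        if tag in self.seen and tag in self.store:
            return False
        self.seen.add(tag)
        self.store[tag] = data
        self.bytes_received += len(data)
        return True


server = DedupServer()
first = server.upload(b"report")   # first client: data is transferred
second = server.upload(b"report")  # second client: upload is skipped
```

In this sketch the second upload of the same bytes is skipped, so storage and bandwidth are spent only once, but the server can trivially link the two clients through the shared tag.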
Cite this article
Jannati, H., Ardeshir-Larijani, E. & Bahrak, B. Privacy in Cross-User Data Deduplication. Mobile Netw Appl 26, 2567–2579 (2021). https://doi.org/10.1007/s11036-018-1100-5
DOI: https://doi.org/10.1007/s11036-018-1100-5