Utility Promises of Self-Organising Maps in Privacy Preserving Data Mining

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 12484)

Abstract

Data mining techniques are highly efficient at sifting through big data to extract hidden knowledge and support evidence-based decisions. However, they pose severe threats to individuals’ privacy, because they can be exploited to make inferences about sensitive data. Researchers have proposed several privacy-preserving data mining techniques to address this challenge. One distinctive method extends anonymisation privacy models into data mining processes to enhance both privacy and utility. Several published works in this area use clustering techniques to enforce anonymisation models on private data: they group the data into clusters using a quality measure and then generalise the data in each group separately to achieve an anonymisation threshold. Although these techniques are highly efficient and practical, guaranteeing an adequate balance between data utility and privacy protection remains a challenge. In addition, existing approaches do not work well with high-dimensional data, since it is difficult to develop good groupings without incurring excessive information loss. Our work aims to overcome these challenges by proposing a hybrid approach that combines self-organising maps with conventional privacy-based clustering algorithms. The main contribution of this paper is to show that dimensionality reduction techniques can improve the anonymisation process by incurring less information loss, thus producing a more desirable balance between privacy and utility.
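
To make the clustering-based anonymisation step described in the abstract concrete, the following is a minimal Python sketch, not the authors' algorithm. It assumes purely numeric quasi-identifiers, a greedy nearest-to-centroid grouping rule, and a range-width information loss measure. The names `greedy_k_clusters`, `generalise` and `information_loss` are hypothetical and chosen for illustration only. A dimensionality reduction step such as a self-organising map would be applied to the records before grouping; it is omitted here.

```python
from typing import List, Sequence, Tuple

Record = Tuple[float, ...]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two numeric records."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(records: Sequence[Record]) -> Record:
    """Attribute-wise mean of a non-empty group of records."""
    n = len(records)
    return tuple(sum(col) / n for col in zip(*records))


def greedy_k_clusters(records: Sequence[Record], k: int) -> List[List[Record]]:
    """Group records into clusters of at least k members (illustrative rule)."""
    if k < 1 or len(records) < k:
        raise ValueError("need k >= 1 and at least k records")
    remaining = list(records)
    clusters: List[List[Record]] = []
    while len(remaining) >= k:
        # Seed a cluster, then grow it with the records nearest its centroid.
        cluster = [remaining.pop(0)]
        while len(cluster) < k:
            c = centroid(cluster)
            nearest = min(range(len(remaining)),
                          key=lambda i: sq_dist(remaining[i], c))
            cluster.append(remaining.pop(nearest))
        clusters.append(cluster)
    # Fewer than k records are left: attach each to its closest cluster.
    for r in remaining:
        best = min(clusters, key=lambda cl: sq_dist(r, centroid(cl)))
        best.append(r)
    return clusters


def generalise(cluster: Sequence[Record]) -> Tuple[Tuple[float, float], ...]:
    """Replace each attribute by the (min, max) range shared by the cluster."""
    return tuple((min(col), max(col)) for col in zip(*cluster))


def information_loss(clusters: Sequence[Sequence[Record]],
                     domain_widths: Sequence[float]) -> float:
    """Sum of normalised range widths, weighted by cluster size (assumed metric)."""
    total = 0.0
    for cl in clusters:
        ranges = generalise(cl)
        total += len(cl) * sum((hi - lo) / w
                               for (lo, hi), w in zip(ranges, domain_widths))
    return total


# Toy data: (age, hours) pairs, anonymised with k = 2.
records = [(25, 50), (27, 52), (60, 90), (62, 95), (26, 51)]
clusters = greedy_k_clusters(records, k=2)
for cl in clusters:
    print(len(cl), generalise(cl))
print(information_loss(clusters, domain_widths=(37, 45)))
```

Every record in a cluster is published with the same generalised ranges, so each released combination of values is shared by at least k records. Wider ranges mean more information loss, which is the quantity the hybrid approach aims to reduce.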

Notes

1. $D^k$ denotes a $k$-anonymised version of the original table $D$.
2. where $\|\cdot\|$ is the measure of distance.

Author information

Authors and Affiliations

Cyber Technology Institute, The Gateway, De Montfort University, Leicester, LE1 9BH, UK
Kabiru Mohammed, Aladdin Ayesh & Eerke Boiten

Corresponding author

Correspondence to Kabiru Mohammed.

Editor information

Editors and Affiliations

Télécom SudParis, Evry Cedex, France
Joaquin Garcia-Alfaro
Departament d’Enginyeria de la Informació i de les Comunicacions, Universitat Autònoma de Barcelona, Bellaterra, Spain
Guillermo Navarro-Arribas
Escola d’Enginyeria, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Barcelona, Spain
Jordi Herrera-Joancomarti

About this paper

Cite this paper

Mohammed, K., Ayesh, A., Boiten, E. (2020). Utility Promises of Self-Organising Maps in Privacy Preserving Data Mining. In: Garcia-Alfaro, J., Navarro-Arribas, G., Herrera-Joancomarti, J. (eds) Data Privacy Management, Cryptocurrencies and Blockchain Technology. DPM 2020, CBT 2020. Lecture Notes in Computer Science, vol 12484. Springer, Cham. https://doi.org/10.1007/978-3-030-66172-4_4

DOI: https://doi.org/10.1007/978-3-030-66172-4_4
Published: 29 December 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-66171-7
Online ISBN: 978-3-030-66172-4
eBook Packages: Computer Science, Computer Science (R0)
