Improving Opinion Analysis Through Statistical Disclosure Control in eVoting Scenarios

  • Conference paper
Electronic Government and the Information Systems Perspective (EGOVIS 2018)

Abstract

This work addresses the problem of Statistical Disclosure Control (SDC) in an electronic voting scenario. Electoral datasets containing voting choices linked to voters’ demographic profile information can be used to perform fine-grained analysis of citizen opinion. However, voters’ privacy must be protected. Traditional SDC techniques study methods to meet some predefined privacy criteria, assuming a trustworthy owner that knows the values of the confidential attributes. Unfortunately, this assumption cannot be made in our scenario, since its dataset contains secret voting choices, which are unknown until they are properly anonymized and decrypted. We propose a protocol and a system architecture to perform SDC in datasets with encrypted attributes, while minimizing the amount of information an attacker can learn about the secret data. The protocol enables the release of electoral datasets, which allow governments and third parties to gain more insight into citizen opinion and to improve decision-making processes and public services.

References

  1. Bernhard, D., Cortier, V., Galindo, D., Pereira, O., Warinschi, B.: SoK: a comprehensive analysis of game-based ballot privacy definitions. In: Proceedings of the IEEE Symposium on Security and Privacy (S&P), San Jose, CA, pp. 499–516, May 2015

  2. Bernhard, D., Warinschi, B.: Cryptographic voting — a gentle introduction. In: Aldini, A., Lopez, J., Martinelli, F. (eds.) FOSAD 2012-2013. LNCS, vol. 8604, pp. 167–211. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10082-1_7

  3. Chaum, D.: Untraceable electronic mail, return addresses, and digital pseudonyms. Commun. ACM 24(2), 84–88 (1981)

  4. Cramer, R., Gennaro, R., Schoenmakers, B.: A secure and optimally efficient multi-authority election scheme. In: Fumy, W. (ed.) EUROCRYPT 1997. LNCS, vol. 1233, pp. 103–118. Springer, Heidelberg (1997). https://doi.org/10.1007/3-540-69053-0_9

  5. Delaune, S., Kremer, S., Ryan, M.D.: Verifying privacy-type properties of electronic voting protocols. J. Comput. Secur. 17(4), 435–487 (2009)

  6. Domingo-Ferrer, J., Mateo-Sanz, J.M.: Practical data-oriented microaggregation for statistical disclosure control. IEEE Trans. Knowl. Data Eng. 14(1), 189–201 (2002)

  7. Domingo-Ferrer, J., Torra, V.: Ordinal, continuous and heterogeneous \(k\)-anonymity through microaggregation. Data Min. Knowl. Discov. 11(2), 195–212 (2005)

  8. Fujioka, A., Okamoto, T., Ohta, K.: A practical secret voting scheme for large scale elections. In: Seberry, J., Zheng, Y. (eds.) AUSCRYPT 1992. LNCS, vol. 718, pp. 244–251. Springer, Heidelberg (1993). https://doi.org/10.1007/3-540-57220-1_66

  9. Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Am. Stat. Assoc. 58(301), 13–30 (1963)

  10. Hundepool, A., et al.: Statistical Disclosure Control. Surv. Method. Wiley, Chichester (2012)

  11. Jonker, H., Mauw, S., Pang, J.: Privacy and verifiability in voting systems: methods, developments and trends. Comput. Sci. Rev. 10, 1–30 (2013)

  12. Li, N., Li, T., Venkatasubramanian, S.: \(t\)-closeness: privacy beyond \(k\)-anonymity and \(l\)-diversity. In: Proceedings of the IEEE International Conference on Data Engineering (ICDE), Istanbul, Turkey, pp. 106–115, April 2007

  13. Machanavajjhala, A., Gehrke, J., Kifer, D., Venkitasubramaniam, M.: \(l\)-diversity: privacy beyond \(k\)-anonymity. In: Proceedings of the IEEE International Conference on Data Engineering (ICDE), p. 24, Apr 2006

  14. Neff, C.A.: A verifiable secret shuffle and its application to E-voting. In: Proceedings of the ACM Conference on Computer and Communications Security (CCS), Philadelphia, PA, USA, pp. 116–125 (2001)

  15. Oganian, A., Domingo-Ferrer, J.: On the complexity of optimal microaggregation for statistical disclosure control. UNECE Stat. J. 18(4), 345–354 (2001)

  16. Rebollo-Monedero, D., Forné, J., Soriano, M.: \(p\)-probabilistic \(k\)-anonymous microaggregation for the anonymization of surveys with uncertain participation (2016, submitted)

  17. Rebollo-Monedero, D., Forné, J., Soriano, M., Puiggalí-Allepuz, J.: \(k\)-anonymous microaggregation with preservation of statistical dependence. Inf. Sci. 342(1), 1–23 (2016)

  18. Samarati, P., Sweeney, L.: Protecting privacy when disclosing information: k-anonymity and its enforcement through generalization and suppression. Technical report, Computer Science Laboratory, SRI International (1998)

  19. Truta, T.M., Vinay, B.: Privacy protection: \(p\)-sensitive \(k\)-anonymity property. In: Proceedings of the International Workshop on Privacy Data Management (PDM), p. 94. IEEE Computer Society (2006)

Acknowledgments

The authors would like to thank Xavier Alsina and Alexey Akimov for their collaboration and helpful comments. This work has been partly supported by the Spanish Ministry of Industry, Energy and Tourism (MINETUR) through the “Acción Estratégica Economía y Sociedad Digital (AEESD)” funding plan, through Project ref. TSI-100202-2013-23 “Data-Distortion Framework (DDF).”

Author information

Corresponding author

Correspondence to Jordi Cucurull.

Appendix: Estimation of the Cluster Size

To execute the microaggregation algorithm in Step 3, an appropriate value of the cluster size must first be estimated using the SDC_KEst algorithm.

Our procedure assumes that the votes are cast under a common voting profile \(\pi =(\pi _1,\dots ,\pi _m)\). That is, each voter chooses option i with probability \(\pi _i\). This information can be provided, e.g., by the Statistical Data Provider in the form of surveys or exit polls. We acknowledge that this is a rather strong assumption, but it is a necessary first step toward solving the problem.

Let \(X_k\) be the m-ary vector r.v. that counts the number of votes cast for each of the m voting options in a cluster of size k. Therefore, \(X_k\) follows a multinomial distribution with parameters k and \(\pi \). Additionally, let \(W_k={{\mathrm{wt}}}(X_k)\) denote the Hamming weight of \(X_k\), i.e., the number of nonzero positions in \(X_k\). Then, a cluster fails to satisfy the p-sensitive privacy property whenever \(W_k< p\). This event occurs with probability

$$ \rho (k) = 1 - \Pr \{W_k \ge p\}=1 - \sum _{x:{{\mathrm{wt}}}(x)\ge p} \frac{k!}{x_1!\cdots x_m!}\pi _1^{x_1}\cdots \pi _m^{x_m}, $$

where the sum runs over all the m-ary vectors x of nonnegative integers whose components sum to k. It is easy to see that \(\Pr \{W_k \ge p\}\ge \Pr \{W_{k-1} \ge p\}\), which means that \(\rho (k)\) is a non-increasing function of k, as intuition suggests.
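For small m and k, the expression for \(\rho (k)\) can be evaluated by brute force. The following is a minimal sketch that enumerates the count vectors x term by term, as in the sum above. The function name `rho` and its argument order are illustrative assumptions, not part of the SDC_KEst algorithm as published.

```python
from itertools import product
from math import factorial, prod


def rho(k, pi, p):
    """Probability that a cluster of k votes shows fewer than p distinct options.

    pi is the voting profile (pi_1, ..., pi_m); votes are assumed i.i.d.
    """
    m = len(pi)
    ok = 0.0  # accumulates Pr{W_k >= p}
    # Enumerate the first m-1 counts; the last one is fixed by the sum constraint.
    for head in product(range(k + 1), repeat=m - 1):
        last = k - sum(head)
        if last < 0:
            continue  # counts exceed k, not a valid vector
        x = head + (last,)
        if sum(1 for c in x if c > 0) < p:
            continue  # Hamming weight below p: cluster fails the criterion
        coeff = factorial(k) // prod(factorial(c) for c in x)
        ok += coeff * prod(q ** c for q, c in zip(pi, x))
    return 1.0 - ok
```

With a uniform two-option profile and p = 2, a cluster fails only when all k votes coincide, so the function should agree with \(2\cdot 0.5^k\). The enumeration grows quickly with m and k, so this form is only suitable for checking small cases.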

Our aim is to estimate a cluster size \(k_\text {est}\) so that, with probability at least \(1-\epsilon \), a proportion of at least \(1-\delta \) of the clusters satisfies the p-sensitive k-anonymity privacy criterion, for sufficiently small values of \(\epsilon \) and \(\delta >\rho (k)\). For a total of \(N=n/k\) clusters, let \(N^*\) be the number of clusters satisfying the privacy criterion. Using the Chernoff bound [9], we see that

$$ \Pr \{N^*\ge N(1-\delta )\}\ge 1 - e^{-N {{\mathrm{D}}}(\delta \Vert \rho (k))}, \tag{3} $$

where \({{\mathrm{D}}}(\delta \Vert \rho (k))\) denotes the Kullback-Leibler divergence between two Bernoulli distributed r.v.’s with parameters \(\delta \) and \(\rho (k)\), respectively. Therefore, the SDC_KEst algorithm obtains an estimate of the cluster size as

$$ k_{\text {est}} = \max \{k, \min \{k' : e^{- \frac{n}{k} {{\mathrm{D}}}(\delta \Vert \rho (k'))} \le \epsilon \}\}. $$
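The search defined by this formula can be sketched as follows. This is an illustrative reading, not the published SDC_KEst implementation, and it rests on several assumptions: n is taken to be the total number of records, the k inside the formula is treated as a given minimum cluster size (`k_min`), the exponent uses n/k exactly as printed, candidates k' are restricted to those with \(\delta >\rho (k')\) so that the bound applies, and the search is capped at a hypothetical `k_max`. The divergence is computed with natural logarithms to match the base-e exponential, and \(\rho \) is evaluated by inclusion-exclusion over the set of options that appear in a cluster.

```python
from itertools import combinations
from math import exp, log


def rho(k, pi, p):
    """Pr{W_k < p}: fewer than p distinct options among k multinomial votes."""
    total = 0.0
    for size in range(1, p):
        for support in combinations(range(len(pi)), size):
            # Inclusion-exclusion: Pr{the options seen are exactly `support`}.
            for t in range(1, size + 1):
                for sub in combinations(support, t):
                    total += (-1) ** (size - t) * sum(pi[i] for i in sub) ** k
    return min(1.0, max(0.0, total))  # clamp floating-point round-off


def bernoulli_kl(a, b):
    """D(a || b) in nats between Bernoulli(a) and Bernoulli(b), for 0 < b < 1."""
    def term(u, v):
        return 0.0 if u == 0.0 else u * log(u / v)
    return term(a, b) + term(1.0 - a, 1.0 - b)


def estimate_cluster_size(n, k_min, pi, p, delta, eps, k_max=1000):
    """Smallest admissible k' (assumed reading of the formula), floored at k_min."""
    n_clusters = n / k_min  # exponent n/k, as printed in the formula
    for k_prime in range(1, k_max + 1):
        r = rho(k_prime, pi, p)
        if r >= delta:
            continue  # the bound requires delta > rho(k')
        # r == 0 means every cluster satisfies the criterion with certainty.
        if r <= 0.0 or exp(-n_clusters * bernoulli_kl(delta, r)) <= eps:
            return max(k_min, k_prime)
    raise ValueError("no admissible cluster size found up to k_max")
```

Because \(\rho \) does not increase with k', the first candidate that passes the test is the minimum, so a linear scan suffices. A call such as `estimate_cluster_size(10000, 5, (0.5, 0.5), 2, 0.05, 0.01)` uses made-up inputs purely for illustration.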

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Blasco, P., Moreira, J., Puiggalí, J., Cucurull, J., Rebollo-Monedero, D. (2018). Improving Opinion Analysis Through Statistical Disclosure Control in eVoting Scenarios. In: Kő, A., Francesconi, E. (eds) Electronic Government and the Information Systems Perspective. EGOVIS 2018. Lecture Notes in Computer Science, vol 11032. Springer, Cham. https://doi.org/10.1007/978-3-319-98349-3_4

  • DOI: https://doi.org/10.1007/978-3-319-98349-3_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98348-6

  • Online ISBN: 978-3-319-98349-3

  • eBook Packages: Computer Science (R0)
