Abstract
Private stream aggregation (PSA) allows an untrusted data aggregator to compute statistics over a set of multiple participants’ data while ensuring the data remains private. Existing works rely on a trusted third party to enable an aggregator to achieve fault tolerance, that requires interactive recovery, but in the real world this may not be practical or secure. We develop a new formal framework for PSA that accounts for user faults, and can support non-interactive recovery, while still supporting strong individual privacy guarantees. We first must define a new level of security in the presence of faults and malicious adversaries because the existing definitions do not account for faults and the security implications of the recovery. After this we develop the first protocol that provably reaches this level of security, i.e., individual inputs are private even after the aggregator’s recovery, and reach new levels of scalability and communication efficiency over existing work seeking to support fault tolerance. The techniques we develop are general, and can be used to augment any PSA scheme to support non-interactive fault recovery.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Bao, H., Lu, R.: DDPFT: secure data aggregation scheme with differential privacy and fault tolerance. In: IEEE ICC 2015, pp. 7240–7245. IEEE (2015)
Bao, H., Lu, R.: A new differentially private data aggregation with fault tolerance for smart grid communications. IEEE IoT-J 2(3), 248–258 (2015)
Bao, H., Lu, R.: A lightweight data aggregation scheme achieving privacy preservation and data integrity with differential privacy and fault tolerance. Peer Peer Netw. Appl. 10(1), 106–121 (2017)
Becker, D., Guajardo, J., Zimmermann, K.H.: Revisiting private stream aggregation: lattice-based PSA. In: NDSS (2018)
Boneh, D., Goh, E.-J., Nissim, K.: Evaluating 2-DNF formulas on ciphertexts. In: Kilian, J. (ed.) TCC 2005. LNCS, vol. 3378, pp. 325–341. Springer, Heidelberg (2005). https://doi.org/10.1007/978-3-540-30576-7_18
Chan, T.-H.H., Shi, E., Song, D.: Privacy-preserving stream aggregation with fault tolerance. In: Keromytis, A.D. (ed.) FC 2012. LNCS, vol. 7397, pp. 200–214. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32946-3_15
Chen, J., Ma, H., Zhao, D.: Private data aggregation with integrity assurance and fault tolerance for mobile crowd-sensing. Wireless Netw. 23(1), 131–144 (2017)
Chen, L., Lu, R., Cao, Z.: PDAFT: a privacy-preserving data aggregation scheme with fault tolerance for smart grid communications. Peer Peer Netw. Appl. 8(6), 1122–1132 (2015)
Chotard, J., Dufour Sans, E., Gay, R., Phan, D.H., Pointcheval, D.: Decentralized multi-client functional encryption for inner product. In: Peyrin, T., Galbraith, S. (eds.) ASIACRYPT 2018. LNCS, vol. 11273, pp. 703–732. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-03329-3_24
Gillin, D.: The federal trade commission and internet privacy. Mark. Res. 12(3), 39 (2000)
Han, S., Zhao, S., Li, Q., Ju, C.H., Zhou, W.: PPM-HDA: privacy-preserving and multifunctional health data aggregation with fault tolerance. IEEE Trans. Inf. Forensics Secur. 11(9), 1940–1955 (2015)
Joye, M., Libert, B.: A scalable scheme for privacy-preserving aggregation of time-series data. In: Sadeghi, A.-R. (ed.) FC 2013. LNCS, vol. 7859, pp. 111–125. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39884-1_10
Jung, T., Mao, X., Li, X., Tang, S., Gong, W., Zhang, L.: Privacy-preserving data aggregation without secure channel: multivariate polynomial evaluation. In: IEEE INFOCOM (2013)
Jung, T., Han, J., Li, X.Y.: PDA: semantically secure time-series data analytics with dynamic subgroups. TDSC 15(2), 260–274 (2016)
Karl, R., Burchfield, T., Takeshita, J., Jung, T.: Non-interactive MPC with trusted hardware secure against residual function attacks. In: Chen, S., Choo, K.-K.R., Fu, X., Lou, W., Mohaisen, A. (eds.) SecureComm 2019. LNICST, vol. 305, pp. 425–439. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-37231-6_25
Li, Q., Cao, G.: Efficient privacy-preserving stream aggregation in mobile sensing with low aggregation error. In: De Cristofaro, E., Wright, M. (eds.) PETS 2013. LNCS, vol. 7981, pp. 60–81. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39077-7_4
Martin, D.: Intel Xeon ice lake CPUs to get SGX with expanded security features (2020)
Mofrad, S., Zhang, F., Lu, S., Shi, W.: A comparison study of intel SGX and AMD memory encryption technology. In: ACM HASP, pp. 1–8 (2018)
Ni, J., Zhang, K., Alharbi, K., Lin, X., Zhang, N., Shen, X.S.: Differentially private smart metering with fault tolerance and range-based filtering. IEEE Trans. Smart Grid 8(5), 2483–2493 (2017)
Parikh, N., Sundaresan, N.: Scalable and near real-time burst detection from ecommerce queries. In: ACM SIGKDD, KDD 2008, pp. 972–980. ACM (2008)
Rabin, T.: A simplified approach to threshold and proactive RSA. In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 89–104. Springer, Heidelberg (1998). https://doi.org/10.1007/BFb0055722
Russell, M.A.: Mining the Social Web. O’Reilly Media, Inc., Sebastopol (2011)
Shi, E., Chan, T.H., Rieffel, E., Chow, R., Song, D.: Privacy-preserving aggregation of time-series data. In: Proceedings of NDSS, vol. 2, pp. 1–17. Citeseer (2011)
Shi, J., Zhang, R., Liu, Y., Zhang, Y.: Prisense: privacy-preserving data aggregation in people-centric urban sensing systems. In: INFOCOM, pp. 1–9. IEEE (2010)
Valovich, F.: Aggregation of time-series data under differential privacy. In: Lange, T., Dunkelman, O. (eds.) LATINCRYPT 2017. LNCS, vol. 11368, pp. 249–270. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25283-0_14
Valovich, F., Aldà , F.: Computational differential privacy from lattice-based cryptography. In: Kaczorowski, J., Pieprzyk, J., Pomykała, J. (eds.) NuTMiC 2017. LNCS, vol. 10737, pp. 121–141. Springer, Cham (2018). https://doi.org/10.1007/978-3-319-76620-1_8
Vargas, R.E.V., et al.: A realistic and public dataset with rare undesirable real events in oil wells. J. Pet. Sci. Eng. 181, 106223 (2019)
Wang, X., Liu, Y., Choo, K.: Fault tolerant, multi-subset aggregation scheme for smart grid. IEEE Trans. Ind. Inform. 17(6), 4065–4072 (2020)
Weichbrodt, N., Kurmus, A., Pietzuch, P., Kapitza, R.: AsyncShock: exploiting synchronisation bugs in intel SGX enclaves. In: Askoxylakis, I., Ioannidis, S., Katsikas, S., Meadows, C. (eds.) ESORICS 2016. LNCS, vol. 9878, pp. 440–457. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45744-4_22
Won, J., Ma, C.Y., Yau, D.K., Rao, N.S.: Proactive fault-tolerant aggregation protocol for private smart metering. In: INFOCOM, pp. 2804–2812. IEEE (2014)
Xue, K., Yang, Q., Li, S., Wei, D.S., Peng, M., Memon, I., Hong, P.: PPSO: a privacy-preserving service outsourcing scheme for real-time pricing demand response in smart grid. IEEE Internet Things J. 6(2), 2486–2496 (2018)
Acknowledgement
This work was supported by Facebook as a winner of the Role of Applied Cryptography in a Privacy-Focused Advertising Ecosystem Facebook RFP. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the sponsor.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
A Proof of Fault Tolerable Aggregator Obliviousness
A Proof of Fault Tolerable Aggregator Obliviousness
Theorem 1
Assuming that the Decisional Diffie-Hellman problem is hard in the group G and that the hash function H is a random oracle, then the above construction satisfies aggregator oblivious security with fault tolerance, as described in Definition 2.
Proof
First, we prove that the following intermediate game is difficult to win, given that Decisional Diffie-Hellman is hard. Let \(\mathbb {G}\) be a group of prime order p.
Setup: The challenger picks random generators \(g, h \in \) \(\mathbb {G},\) and random \(\alpha _{0}, \alpha _{1}, \ldots , \alpha _{n} \in \mathbb {Z}_{p}\) such that \(\sum _{i=0}^n \alpha _{i}=0 .\) The challenger gives the adversary: \(g, h, g^{\alpha _{0}}, g^{\alpha _{2}}, \ldots , g^{\alpha _{n}}\).
Queries: The adversary can compromise users adaptively and ask for the value of \(\alpha _{i}\). The challenger returns \(\alpha _{i}\) to the adversary when queried.
Challenge: The adversary selects an uncompromised set \(Q \subseteq \{0, \ldots , N\}\), and specifies a subset of Q denoted Y of users they claim faulted, where \(J = Y\) for the duration of the game. The challenger flips a random bit b. If \(b=0,\) the challenger returns to the adversary \(\left\{ h^{\alpha _{q}} \mid q \in Q \backslash Y\right\} , \left\{ h^{\alpha _{y}} \mid y \in Y\right\} \). If \(b=1,\) the challenger picks |Q|/|Y| random elements \(h_{q}^{\prime }\), for \(q \in Q/Y\) and |Y| random elements \(h_{y}^{\prime }\), for \(y \in Y\) from the group \(\mathbb {G},\) such that \(\sum _{q \in Q} h_{q}^{\prime } + \sum _{y \in Y} h_{y}^{\prime } =\sum _{q \in Q} h^{\alpha _{q}} + \sum _{y \in Y} h^{\alpha _{y}}\). The challenger returns \(h_{q}^{\prime }\), for \(q \in Q/Y\) and \(h_{y}^{\prime }\), for \(y \in Y\) to the adversary. The adversary can make additional compromise queries, as described in the above step as they see fit.
Guess: The adversary guesses either \(b=0\) or 1. The adversary wins if they have not asked for any \(\alpha _{q}\) for \(q \in Q,\) \(Y = J\), and if they successfully guess b.
Lemma 1
The above game is difficult for computationally bounded adversaries assuming Decisional Diffie Hellman is hard for group \(\mathbb {G}\).
We define the following sequence of hybrid games, and assume that the set Q specified by the adversary in the challenge stage is \(Q=\left\{ q_{1}, q_{2}, \ldots , q_{m}\right\} \). For simplicity, we write \(\left( \beta _{1}, \ldots , \beta _{m}\right) \) := \(\left( \alpha _{q_{1}}, \ldots , \alpha _{q_{m}}\right) ,\) and include Y within Q to save space. In \(Game_{d},\) the challenger sends the following to the adversary: \(R_{1}, R_{2}, \ldots , R_{d},\) \(h^{\beta _{d+1}},\) ..., \(h^{\beta _{m}}\). Here, each \(R_{q}(q \in [d])\) means an independent fresh random number, and the following condition holds: \(\prod _{1 \le q \le d} R_{q}=\prod _{1 \le q \le d} h^{\beta _{q}}\). Clearly \(Game_{1}\) is equivalent to the case when \(b=0\), and \(Game_{m-1}\) is equivalent to the case when \(b=1\). With the hybrid argument we can show that games \(Game_{d-1}\) and \(Game_{d}\) are computationally indistinguishable. To demonstrate this, we show that if, for some d, there exists a polynomial-time adversary \(\mathcal {A}\) who can distinguish between \(Game_{d-1}\) and \(Game_{d},\) we can then construct an algorithm \(\mathcal {B}\) which can solve the DDH problem.
Suppose \(\mathcal {B}\) obtains a DDH tuple \(\left( g, g^{x}, g^{l}, T\right) \). \(\mathcal {B}\)’s task is to decide whether \(T=g^{x l}\) or whether T is a random element from \(\mathbb {G}\). Now \(\mathcal {B}\) randomly guesses two indices e and b to be the \(d^{\mathrm {th}}\) and the \((d+1)^{\mathrm {th}}\) values of the set Q specified by the adversary in the challenge phase. The guess is correct with probability \(\frac{1}{N^{2}},\) and in case the guess is wrong, the algorithm \(\mathcal {B}\) aborts. Now \(\mathcal {B}\) picks random exponents \(\left\{ \alpha _{q}\right\} _{q \ne e, q \ne b}\) and sets \(\alpha _{b}=x\) and \(\alpha _{e}=-\sum _{q \ne e} \alpha _{q}\). Notice that \(\mathcal {B}\) does not know the values of \(\alpha _{e}\) and \(\alpha _{b},\) however, it can compute the values of \(g^{\alpha _{b}}=g^{x}\) and \(g^{\alpha _{e}}=\left( \prod _{q \ne e} g^{\alpha _{q}}\right) ^{-1}=\left( g^{x}\right) ^{-1}\). \(\prod _{q \ne e, q \ne b} g^{\alpha _{q}} \cdot \mathcal {B}\) gives \(\mathcal {A}\) the tuple \(\left( g, h=g^{l}, g^{\alpha _{1}}, \ldots , g^{\alpha _{n}}\right) \). If \(\mathcal {A}\) asks for any exponent except \(\alpha _{e}\) and \(\alpha _{b}, \mathcal {B}\) returns the corresponding \(\alpha _{q}\) value to \(\mathcal {A}\); if \(\mathcal {A}\) asks for \(\alpha _{e}\) or \(\alpha _{b},\) the algorithm \(\mathcal {B}\) aborts.
In the challenge phase, \(\mathcal {A}\) submits a set \(Q=\) \( \{q_{1}, q_{2}, \ldots q_{m} \} .\) If e and b are not the \(d^{\mathrm {th}}\) and the \((d+1)^{\mathrm {th}}\) values of the set Q, i.e., if \(q_{d} \ne e\) or \(q_{d+1} \ne b,\) the algorithm \(\mathcal {B}\) aborts. If \(q_{d}=e\) and \(q_{d+1}=b,\) then \(\mathcal {B}\) returns to \(\mathcal {A}\): \( R_{1}, R_{2}, \ldots , R_{d-1}\), \( (\prod _{q \notin \{q_{1}, \ldots , q_{d+1} \}} (g^{l} )^{\alpha _{q}} \cdot \prod _{q=1}^{d-1} R_{q} \cdot T )^{-1}\), T, and \( (g^{l} )^{\alpha _{{q}_{d+2}}, \ldots , (g^{l} )^{\alpha _{q_{m}}}}\). Clearly if \(T=g^{x l}\), then the above game is equivalent to \(Game_{d-1}\). Otherwise, if \(T \in _{R} \mathbb {G}\), then the above game is equivalent to \(Game_{d}\). Thus, if \(\mathcal {A}\) has a non-negligible advantage in guessing whether it is playing \(Game_{d-1}\) or \(Game_{d}\) and \(\mathcal {B}\) could solve the DDH problem with non-negligible advantage.
Now to prove the theorem, we will modify the aggregator oblivious security game. In the Encrypt queries, if the adversary submits a request for some tuple \((q, x, t^{*} )\) where \(t^{*}\) is the time step specified in the Challenge phase, the challenger treats this as a Compromise query, and simply returns the \(sk_{q}\) to the adversary. Given \(sk_{q},\) the adversary can compute the requested ciphertext. The adversary has access to a the functionality, FaultRecover, that can only be called once (since this is enforced via trusted hardware), which takes in a set of users that have not been compromised (\(j \in J\)), and returns the set of ciphertexts that correspond to those users encrypting 0. Note that this modification actually gives more power to the adversary. From now on, we will assume that the adversary does not make any Encrypt queries for the time \(t^{*}\).
Let \(K \subseteq N\) denote the set of compromised participants. Let \(\bar{K}:=N \backslash K\) denote the set of uncompromised participants. Since we assume the aggregator is untrusted, we are interested in the case where \(Q=\bar{K}\) or the aggregator key has been compromised. We must show that the adversary cannot distinguish whether the challenger returns a true encryption of the plaintext submitted in the challenge stage, or a random tuple with the same aggregation.
Given an adversary \(\mathcal {A}\) who can break the PSA game with non-negligible probability, we construct an algorithm \(\mathcal {B}\) that can solve the above intermediate problem with non-negligible probability. \(\mathcal {B}\) obtains from the challenger \(\mathcal {C}\) the tuple \(g, h, g^{\alpha _{0}}, g^{\alpha _{1}}, \ldots , g^{\alpha _{n}}\). \(\mathcal {B}\) sets \(\alpha _{0}\) to be the aggregator’s key, and \(\alpha _{1}, \ldots , \alpha _{n}\) to be the secret keys of participants 1 through n respectively. Note param is g.
Let \(q_{H}\) denote the total number of oracle queries made by the adversary \(\mathcal {A}\) and by the algorithm \(\mathcal {B}\) itself. \(\mathcal {B}\) guesses at random an index \(b \in \left[ q_{H}\right] \). Suppose the input to the \(b^{\text {th}}\) random oracle query is \(t^{*}\). The algorithm \(\mathcal {B}\) assumes that \(t^{*}\) will be the challenge time step. If the guess is found to be wrong later, \(\mathcal {B}\) aborts.
Hash Function Simulation: The adversary submits a hash query for the integer t. \(\mathcal {B}\) first checks the list \(\mathcal {L}\) to see if t has appeared in any entry (t, z) . If so, \(\mathcal {B}\) returns \(g^{z}\) to the adversary. Otherwise, if this is not the \(b^{\text {th}}\) query, \(\mathcal {B}\) picks a random exponent z and returns \(g^{z}\) to the adversary, and saves (t, z) to a list \(\mathcal {L}.\) For the \(b^{\text {th}}\) query, \(\mathcal {B}\) returns h.
Then the following Queries can take place:
-
\(\mathbf{Encrypt} \): The adversary \(\mathcal {A}\) submits an Encrypt query for the tuple (q, x, t) . In the modified version of the game, we ensure that \(t \ne t^{*},\) as otherwise, we simply treat it as a Compromise query. \(\mathcal {B}\) checks if a hash query has been made on t, and if not, \(\mathcal {B}\) makes a hash oracle query on t. Thus, \(\mathcal {B}\) learns the discrete log of H(t). Now \(H(t)=g^{z},\) so \(\mathcal {B}\) knows z, and since \(\mathcal {B}\) also knows \(g^{\alpha _{q}}, \mathcal {B}\) can compute the ciphertext \(g^{x} \cdot \left( g^{z}\right) ^{\alpha _{q}} \text{ as } g^{x} \cdot \left( g^{\alpha _{q}}\right) ^{z}\).
-
\(\mathbf{Compromise} \): \(\mathcal {B}\) forwards \(\mathcal {A}\)’s query to its own challenger \(\mathcal {C},\) and forwards the answer \(\alpha _{q}\) to \(\mathcal {A}\).
-
\(\mathbf{FaultRecover} \): \(\mathcal {B}\) forwards \(\mathcal {A}\)’s query to its own challenger \(\mathcal {C},\) and forwards the set of ciphertexts (i.e. \(\forall j \in J\), \(c \longleftarrow g^{0} \cdot H(t)^{s k_{j}}\))) to \(\mathcal {A}\).
Challenge: The adversary \(\mathcal {A}\) submits a set \(N = J \cup Q\) and a time \(t^{*},\) as well as plaintexts \(\left\{ x_{q} \mid q \in N\right\} \). If \(t^{*}\) does not agree with the value submitted in the \(b^{\text {th}}\) hash query, then \(\mathcal {B}\) aborts. \(\mathcal {B}\) submits the set Q in a Challenge query to its own challenger, and it obtains a tuple \(\left\{ T_{q}\right\} _{q \in N}\). The challenger returns the following ciphertexts to the adversary: \( \forall q \in Q: g^{x_{q}} \cdot T_{q}\) (i.e. \(c \longleftarrow g^{x_q} \cdot H(t)^{sk_{q}} \cdot T_q\)).
More Queries: Same as the Query stage.
Guess: If the adversary \(\mathcal {A}\) guesses that \(\mathcal {B}\) has returned a random tuple then \(\mathcal {B}\) guesses \(b^{\prime }=1\). Otherwise, \(\mathcal {B}\) guesses that \(b^{\prime }=0\)
If the challenger \(\mathcal {C}\) returns \(\mathcal {B}\) a faithful Diffie-Hellman tuple \(\forall q \in Q: T_{q}=h^{\alpha _{q}},\) then the ciphertext returned to the adversary \(\mathcal {A}\) is a true encryption of the plaintext submitted by the adversary. Otherwise, if the challenger returns to \(\mathcal {B}\) a random tuple, then the ciphertext returned to \(\mathcal {A}\) is random under the product constraint.
Rights and permissions
Copyright information
© 2021 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Karl, R., Takeshita, J., Jung, T. (2021). Cryptonite: A Framework for Flexible Time-Series Secure Aggregation with Non-interactive Fault Recovery. In: Garcia-Alfaro, J., Li, S., Poovendran, R., Debar, H., Yung, M. (eds) Security and Privacy in Communication Networks. SecureComm 2021. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 398. Springer, Cham. https://doi.org/10.1007/978-3-030-90019-9_16
Download citation
DOI: https://doi.org/10.1007/978-3-030-90019-9_16
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-90018-2
Online ISBN: 978-3-030-90019-9
eBook Packages: Computer ScienceComputer Science (R0)