Revisiting Key Decomposition Techniques for FHE: Simpler, Faster and More Generic

M. G. Belorgey⁹,
S. Carpov¹⁰,
N. Gama¹¹,
S. Guasch¹¹ &
…
D. Jetchev^10,12,13

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 15484))

Included in the following conference series:

International Conference on the Theory and Application of Cryptology and Information Security

310 Accesses

Abstract

Ring-LWE based homomorphic encryption computations in large depth use a combination of two techniques: 1) decomposition of big numbers into small limbs/digits, and 2) efficient cyclotomic multiplications modulo $X^N+1$. It was long believed that the two mechanisms had to be strongly related, like in the full-RNS setting that uses a CRT decomposition of big numbers over an NTT-friendly family of prime numbers, and NTT over the same primes for multiplications. However, in this setting, NTT was the bottleneck of all large-depth FHE computations. A breakthrough result from Kim et al. (Crypto’2023) managed to overcome this limitation by introducing a second gadget decomposition and showing that it indeed shifts the bottleneck and renders the cost of NTT computations negligible compared to the rest of the computation. In this paper, we extend this result (far) beyond the Full-RNS settings and show that we can completely decouple the big number decomposition from the cyclotomic arithmetic aspects. As a result, we get modulus switching/rescaling for free. We verify both in theory and in practice that the performance of key-switching, external and internal products and automorphisms using our representation are faster than the one achieved by Kim et al., and we discuss the high impact of these results for low-level or hardware optimizations as well as the benefits of the new parametrizations for FHE compilers. We even manage to lower the running time of the gate bootstrapping of ${{\,\mathrm{\texttt {TFHE}}\,}}$ by eliminating one eighth of the FFTs and one sixth of the linear operations, which lowers the running time below 5.5ms on recent CPUs.

M. G. Belorgey, S. Carpov, and D. Jetchev—This work has been done by the authors while they were employed at Inpher.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 64.99; Price excludes VAT (USA)

Softcover Book: USD 79.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

On Polynomial Functions Modulo $$p^e$$ and Faster Bootstrapping for Homomorphic Encryption

Towards Practical Multi-key TFHE: Parallelizable, Key-Compatible, Quasi-linear Complexity

Revisiting Homomorphic Encryption Schemes for Finite Fields

Notes

1.
There are many equivalent ways to define the relinearization key. Here, we use an external product and a zero term in the ${{\,\mathrm{\texttt {TRLWE}}\,}}$ ciphertext, which can be propagated in the algorithm and simplified, the original references of Fan–Vercauteren [17] present a key-switching that substitute a key $s^2$ by s.
2.
In [11] this ciphertext is called RLev, we could also name it RK(u) since it has the shape of a relinearization key. However the GSW name better depicts the fact that it is a ciphertext not some key material, and most importantly, the morphism is half of the $\boxdot $ operation.
3.
Originally, AVX2 was supporting mainly floating point operations with 32-bit floats and 64-bit doubles - it was much later that integer operations were introduced. Today, not all 64-bit integer operations are supported - e.g., it was only recently that SIMD multiplications of vectors of 64-bit integers were introduced in AVX512 and recently, Intel has disabled AVX512 in AlderLake processors.

References

M. R. Albrecht, R. Player, and S. Scott. On the concrete hardness of learning with errors. Journal of Mathematical Cryptology, 9(3):169–203, 2015.
Google Scholar
J.-C. Bajard, J. Eynard, M. A. Hasan, and V. Zucca. A full RNS variant of FV like somewhat homomorphic encryption schemes. In Selected Areas in Cryptography – SAC 2016, pages 423–442. Springer, 2017.
Google Scholar
M. G. Belorgey, S. Carpov, N. Gama, S. Guasch, and D. Jetchev. Revisiting key decomposition techniques for FHE: Simpler, faster and more generic. Cryptology ePrint Archive, Paper 2023/771, 2023.
Google Scholar
C. Boura, N. Gama, M. Georgieva, and D. Jetchev. CHIMERA: combining ring-lwe-based fully homomorphic encryption schemes. J. Math. Cryptol., 14(1):316–338, 2020.
Google Scholar
Z. Brakerski, C. Gentry, and V. Vaikuntanathan. (leveled) fully homomorphic encryption without bootstrapping. In Innovations in Theoretical Computer Science 2012, pages 309–325. ACM, 2012.
Google Scholar
R. Cammarota. Intel HERACLES: homomorphic encryption revolutionary accelerator with correctness for learning-oriented end-to-end solutions. In Proceedings of the 2022 on Cloud Computing Security Workshop, CCSW. ACM, 2022.
Google Scholar
J. H. Cheon, K. Han, A. Kim, M. Kim, and Y. Song. A full rns variant of approximate homomorphic encryption. Selected areas in cryptography : ... annual international workshop, SAC ... proceedings. SAC, 11349:347–368, 2018.
Google Scholar
J. H. Cheon, A. Kim, M. Kim, and Y. S. Song. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology - ASIACRYPT, Part I, pages 409–437. Springer, 2017.
Google Scholar
I. Chillotti, N. Gama, M. Georgieva, and M. Izabachène. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In ASIACRYPT 2016, Proceedings, Part I, volume 10031 of LNCS, pages 3–33. Springer, 2016.
Google Scholar
I. Chillotti, N. Gama, M. Georgieva, and M. Izabachène. TFHE: fast fully homomorphic encryption over the torus. J. Cryptol., 33(1):34–91, 2020.
Google Scholar
I. Chillotti, D. Ligier, J.-B. Orfila, and S. Tap. Improved programmable bootstrapping with larger precision and efficient arithmetic circuits for tfhe. In M. Tibouchi and H. Wang, editors, Advances in Cryptology – ASIACRYPT 2021, pages 670–699, Cham, 2021. Springer International Publishing.
Google Scholar
J. Cooley and J. Tukey. An algorithm for the machine calculation of complex fourier series. Mathematics of Computation, 19(90):297–301, 1965.
Google Scholar
M. Creeger. The rise of fully homomorphic encryption: Often called the holy grail of cryptography, commercial FHE is near. ACM Queue, 20(4):39–60, 2022.
Google Scholar
DARPA. DARPA:data protection in virtual environments (DPRIVE).
Google Scholar
K. Derya, A. C. Mert, E. Öztürk, and E. Savas. Coha-ntt: A configurable hardware accelerator for ntt-based polynomial multiplication. Microprocess. Microsystems, 89:104451, 2022.
Google Scholar
L. Ducas and D. Micciancio. FHEW: bootstrapping homomorphic encryption in less than a second. In EUROCRYPT 2015, Proceedings, Part I, volume 9056 of LNCS, pages 617–640. Springer, 2015.
Google Scholar
J. Fan and F. Vercauteren. Somewhat practical fully homomorphic encryption. IACR Cryptol. ePrint Arch., page 144, 2012.
Google Scholar
C. Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, pages 169–178. ACM, 2009.
Google Scholar
C. Gentry, S. Halevi, and N. P. Smart. Homomorphic evaluation of the aes circuit. In Advances in Cryptology – CRYPTO 2012, pages 850–867. Springer, 2012.
Google Scholar
C. Gentry, A. Sahai, and B. Waters. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Advances in Cryptology - CRYPTO 2013, Part I, pages 75–92. Springer, 2013.
Google Scholar
C. Gentry, A. Sahai, and B. Waters. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In CRYPTO 2013, Proceedings, Part I, volume 8042 of LNCS, pages 75–92. Springer, 2013.
Google Scholar
S. Halevi, Y. Polyakov, and V. Shoup. An improved rns variant of the bfv homomorphic encryption scheme. In Topics in Cryptology – CT-RSA 2019, pages 83–105. Springer, 2019.
Google Scholar
M. Kim, D. Lee, J. Seo, and Y. Song. Accelerating he operations from key decomposition technique. In Advances in Cryptology – CRYPTO 2023: 43rd Annual International Cryptology Conference, Proceedings, Part IV. Springer-Verlag, 2023.
Google Scholar
I. Kundu, E. Cottle, F. Michel, J. Wilson, and N. New. The dawn of energy efficient computing: Optically accelerating the fast fourier transform core. In Photonics in Switching and Computing 2021. Optica Publishing Group, 2021.
Google Scholar
Y. Lee, S. Heo, S. Cheon, S. Jeong, C. Kim, E. Kim, D. Lee, and H. Kim. Hecate: Performance-aware scale optimization for homomorphic encryption compiler. In 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 193–204, 2022.
Google Scholar
V. Lyubashevsky, C. Peikert, and O. Regev. EUROCRYPT, chapter On Ideal Lattices and Learning with Errors over Rings, pages 1–23. Springer, 2010.
Google Scholar
A. C. Mert, E. Öztürk, and E. Savas. FPGA implementation of a run-time configurable ntt-based polynomial multiplication hardware. Microprocess. Microsystems, 78:103219, 2020.
Google Scholar
D. T. Nguyen and K. Gaj. Fast falcon signature generation and verification using armv8 neon instructions. In Progress in Cryptology - AFRICACRYPT 2023, pages 417–441. Springer, 2023.
Google Scholar
Ö. Özerk, C. Elgezen, A. C. Mert, E. Öztürk, and E. Savas. Efficient number theoretic transform implementation on GPU for homomorphic encryption. J. Supercomput., 78(2):2840–2872, 2022.
Google Scholar
O. Regev. On lattices, learning with errors, random linear codes, and cryptography. J. ACM, 56(6):34:1–34:40, 2009.
Google Scholar
S. S. Roy, F. Turan, K. Järvinen, F. Vercauteren, and I. Verbauwhede. FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In 25th IEEE International Symposium on High Performance Computer Architecture, pages 387–398. IEEE, 2019.
Google Scholar
A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing (Arch. Elektron. Rechnen), 7:281–292, 1971.
Google Scholar
TFHE-lib. TFHE: Fast Fully Homomorphic Encryption over the Torus. https://tfhe.github.io/tfhe/.
F. Turan, S. S. Roy, and I. Verbauwhede. HEAWS: an accelerator for homomorphic encryption on the amazon AWS FPGA. IEEE Trans. Computers, 2020.
Google Scholar
E. R. Türkoglu, A. S. Özcan, C. Ayduman, A. C. Mert, E. Öztürk, and E. Savas. An accelerated GPU library for homomorphic encryption operations of BFV scheme. In IEEE International Symposium on Circuits and Systems, 2022.
Google Scholar
Zama. TFHE-rs: A Pure Rust Implementation of the TFHE Scheme for Boolean and Integer Arithmetics Over Encrypted Data, 2024. https://github.com/zama-ai/tfhe-rs.

Download references

Author information

Authors and Affiliations

Tune Insight, Lausanne, Switzerland
M. G. Belorgey
Arcium, Baar, Switzerland
S. Carpov & D. Jetchev
SandboxAQ, Palo-Alto, USA
N. Gama & S. Guasch
IOG, Colorado, Costa Rica
D. Jetchev
EPFL, Lausanne, Switzerland
D. Jetchev

Authors

M. G. Belorgey
View author publications
You can also search for this author in PubMed Google Scholar
S. Carpov
View author publications
You can also search for this author in PubMed Google Scholar
N. Gama
View author publications
You can also search for this author in PubMed Google Scholar
S. Guasch
View author publications
You can also search for this author in PubMed Google Scholar
D. Jetchev
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding authors

Correspondence to M. G. Belorgey or N. Gama .

Editor information

Editors and Affiliations

Academia Sinica, Taipei, Taiwan
Kai-Min Chung
NTT Social Informatics Laboratories, Tokyo, Japan
Yu Sasaki

A Appendix: Normalization and Reduction Lemma Proof

Proof

of Lemma 3.3. Let $n = \lceil (L+M)/K\rceil $ be the number of algorithm iterations. Let $\texttt {acc}^{(i)}$ be accumulator value after step 3 of iteration i and let $\texttt {acc}^{(n+1)} = 0$. At iteration i, $n\ge i \ge 1$, we have:

$$\begin{aligned} \texttt {acc}^{(i)} = A_{i} + \epsilon _{i+1} \text { and } R_{i} = \texttt {acc}^{(i)} - \epsilon _{i} 2^K, \end{aligned}$$

(9)

here $\epsilon _{i} = \left\lfloor \texttt {acc}^{(i)} 2^{-K} \right\rfloor $ and has integer values. The evaluation of result polynomial at $2^{-K}$ gives:

$$\begin{aligned} \phi _K\left( R\right) = \sum _{i=1}^{\ell } R_i 2^{-iK} = \sum _{i=1}^{\ell } \left( A_{i} + \epsilon _{i+1} - \epsilon _{i} 2^K \right) 2^{-iK}. \end{aligned}$$

Expanding the sum and simplifying common expressions we obtain:

$$\begin{aligned} \phi _K\left( R\right) = \sum _{i=1}^{\ell } A_{i}2^{-iK} + \epsilon _{\ell +1} 2^{-\ell K} - \epsilon _1 \end{aligned}$$

Now, we will prove that $\left\| \left( \phi _K(A)-\phi _{K}(R)\right) \bmod \mathcal {R}\right\| _{\infty }$ is smaller than $2^{-L}$. We have:

$$ \phi _K(A)-\phi _{K}(R) = A_0 + \sum _{\ell < i} A_i2^{-iK} - \epsilon _{(\ell +1)} 2^{-\ell K} + \epsilon _1. $$

Observe that terms $A_0$ and $\epsilon _1$ are integers and are reduced by $\bmod \mathcal {R}$ operation, we obtain:

$$ \left( \phi _K(A)-\phi _{K}(R)\right) \bmod \mathcal {R}= \sum _{\ell < i} A_i2^{-iK} - \epsilon _{\ell +1} 2^{-\ell K}. $$

From (9) it is easy to see that $A_{i} - \epsilon _{i}2^{K} \equiv R_i - \epsilon _{i+1}$. Using previous relations we have:

$$\begin{aligned} \left( \phi _K(A)-\phi _{K}(R)\right) \bmod \mathcal {R}= 2^{-(\ell + 1) K} (A_{\ell + 1} - \epsilon _{\ell +1}2^K) + \sum _{\ell +1 < i} A_i2^{-iK} \\ = 2^{-(\ell + 1) K} R_{\ell +1} + \sum _{\ell +1 < i} A_i2^{-iK} - \epsilon _{\ell +2} 2^{-(\ell + 1) K} = \dots = \sum _{i=\ell +1}^{n} R_i 2^{-iK} + \sum _{n < i} A_i 2^{-iK}. \end{aligned}$$

Looking at the infinity norm we have:

$$\begin{aligned} \left\| \left( \phi _K(A)-\phi _{K}(R)\right) \bmod \mathcal {R}\right\| _{\infty } = \left\| \sum _{i=\ell +1}^{n} R_i 2^{-iK} + \sum _{n < i} A_i 2^{-iK}\right\| _{\infty } \\ < \left\| \sum _{i=\ell +1}^{n} 2^{-iK + K} + \sum _{n < i} 2^{-iK+M} \right\| _{\infty } \le 2^{-\ell K + 1} \le 2^{-L}. \end{aligned}$$

Element-wise operations are on integers in interval $\left( -2^{M+1},2^{M+1}\right) $.

The accumulator variable $\texttt {acc}$ (at step 3) has the largest values during algorithm execution. We will prove that its magnitude (i.e. infinity norm) never exceeds $2^{M+1}$.

Algorithm step 3 increases accumulator value by at most $2^M$ and step 5 divides the new value by $2^K$. After the first iteration, we have $\left\| \texttt {acc}^{(n)}\right\| _\infty < 2^{M}$, after second iteration $\left\| \texttt {acc}^{(n-1)}\right\| _\infty < 2^{M-K} + 2^{M}$ and so on until the last iteration where we have $\left\| \texttt {acc}^{(1)}\right\| _\infty < \sum _{0 \le i < n} 2^{M-iK}$, which is the maximum accumulator magnitude attained during algorithm execution. We can rewrite the last expression as:

$$ \left\| \texttt {acc}^{(1)}\right\| _\infty < \sum _{0 \le i < n} 2^{M-iK} = 2^M \cdot \sum _{0 \le i < n} 2^{-iK} < 2^M \cdot 2 = 2^{M+1}, $$

which proves the accumulator bound.

Complexity Algorithm steps 3-5 are executed $\lceil (L+M)/K\rceil $ times. In each iteration 5 operations are performed: an addition (step 3), 3 shifts (2 in step 4 and 1 in step 5) and a subtraction (step 4). The overall complexity of the algorithm is $5 \cdot \lceil (L+M)/K\rceil $ element-wise operations. $\square $

In this paragraph we introduce Algorithm 2, a general conversion and normalization with two limb size K and $\widetilde{K}$. This algorithm is a slightly more complex than Algorithm 1 with $K = {\tilde{K}}$, since it must handle additional binary shifts to synchronize the limb sizes, and thus, the for loop on a single index k is replaced by two while loops and two indexes $k,\tilde{k}$ that decrease at their respective speed. Besides that, it follows the same principle as the single-limb normalization.

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Belorgey, M.G., Carpov, S., Gama, N., Guasch, S., Jetchev, D. (2025). Revisiting Key Decomposition Techniques for FHE: Simpler, Faster and More Generic. In: Chung, KM., Sasaki, Y. (eds) Advances in Cryptology – ASIACRYPT 2024. ASIACRYPT 2024. Lecture Notes in Computer Science, vol 15484. Springer, Singapore. https://doi.org/10.1007/978-981-96-0875-1_6

Download citation

DOI: https://doi.org/10.1007/978-981-96-0875-1_6
Published: 10 December 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0874-4
Online ISBN: 978-981-96-0875-1
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Revisiting Key Decomposition Techniques for FHE: Simpler, Faster and More Generic

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

On Polynomial Functions Modulo $$p^e$$ and Faster Bootstrapping for Homomorphic Encryption

Towards Practical Multi-key TFHE: Parallelizable, Key-Compatible, Quasi-linear Complexity

Revisiting Homomorphic Encryption Schemes for Finite Fields

Notes

References

Author information

Authors and Affiliations

Corresponding authors

Editor information

Editors and Affiliations

A Appendix: Normalization and Reduction Lemma Proof

A Appendix: Normalization and Reduction Lemma Proof

Proof

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us