Abstract
Ring-LWE based homomorphic encryption computations in large depth use a combination of two techniques: 1) decomposition of big numbers into small limbs/digits, and 2) efficient cyclotomic multiplications modulo \(X^N+1\). It was long believed that the two mechanisms had to be strongly related, like in the full-RNS setting that uses a CRT decomposition of big numbers over an NTT-friendly family of prime numbers, and NTT over the same primes for multiplications. However, in this setting, NTT was the bottleneck of all large-depth FHE computations. A breakthrough result from Kim et al. (Crypto’2023) managed to overcome this limitation by introducing a second gadget decomposition and showing that it indeed shifts the bottleneck and renders the cost of NTT computations negligible compared to the rest of the computation. In this paper, we extend this result (far) beyond the Full-RNS settings and show that we can completely decouple the big number decomposition from the cyclotomic arithmetic aspects. As a result, we get modulus switching/rescaling for free. We verify both in theory and in practice that the performance of key-switching, external and internal products and automorphisms using our representation are faster than the one achieved by Kim et al., and we discuss the high impact of these results for low-level or hardware optimizations as well as the benefits of the new parametrizations for FHE compilers. We even manage to lower the running time of the gate bootstrapping of \({{\,\mathrm{\texttt {TFHE}}\,}}\) by eliminating one eighth of the FFTs and one sixth of the linear operations, which lowers the running time below 5.5ms on recent CPUs.
M. G. Belorgey, S. Carpov, and D. Jetchev—This work has been done by the authors while they were employed at Inpher.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
There are many equivalent ways to define the relinearization key. Here, we use an external product and a zero term in the \({{\,\mathrm{\texttt {TRLWE}}\,}}\) ciphertext, which can be propagated in the algorithm and simplified, the original references of Fan–Vercauteren [17] present a key-switching that substitute a key \(s^2\) by s.
- 2.
In [11] this ciphertext is called RLev, we could also name it RK(u) since it has the shape of a relinearization key. However the GSW name better depicts the fact that it is a ciphertext not some key material, and most importantly, the
morphism is half of the \(\boxdot \) operation.
- 3.
Originally, AVX2 was supporting mainly floating point operations with 32-bit floats and 64-bit doubles - it was much later that integer operations were introduced. Today, not all 64-bit integer operations are supported - e.g., it was only recently that SIMD multiplications of vectors of 64-bit integers were introduced in AVX512 and recently, Intel has disabled AVX512 in AlderLake processors.
References
M. R. Albrecht, R. Player, and S. Scott. On the concrete hardness of learning with errors. Journal of Mathematical Cryptology, 9(3):169–203, 2015.
J.-C. Bajard, J. Eynard, M. A. Hasan, and V. Zucca. A full RNS variant of FV like somewhat homomorphic encryption schemes. In Selected Areas in Cryptography – SAC 2016, pages 423–442. Springer, 2017.
M. G. Belorgey, S. Carpov, N. Gama, S. Guasch, and D. Jetchev. Revisiting key decomposition techniques for FHE: Simpler, faster and more generic. Cryptology ePrint Archive, Paper 2023/771, 2023.
C. Boura, N. Gama, M. Georgieva, and D. Jetchev. CHIMERA: combining ring-lwe-based fully homomorphic encryption schemes. J. Math. Cryptol., 14(1):316–338, 2020.
Z. Brakerski, C. Gentry, and V. Vaikuntanathan. (leveled) fully homomorphic encryption without bootstrapping. In Innovations in Theoretical Computer Science 2012, pages 309–325. ACM, 2012.
R. Cammarota. Intel HERACLES: homomorphic encryption revolutionary accelerator with correctness for learning-oriented end-to-end solutions. In Proceedings of the 2022 on Cloud Computing Security Workshop, CCSW. ACM, 2022.
J. H. Cheon, K. Han, A. Kim, M. Kim, and Y. Song. A full rns variant of approximate homomorphic encryption. Selected areas in cryptography : ... annual international workshop, SAC ... proceedings. SAC, 11349:347–368, 2018.
J. H. Cheon, A. Kim, M. Kim, and Y. S. Song. Homomorphic encryption for arithmetic of approximate numbers. In Advances in Cryptology - ASIACRYPT, Part I, pages 409–437. Springer, 2017.
I. Chillotti, N. Gama, M. Georgieva, and M. Izabachène. Faster fully homomorphic encryption: Bootstrapping in less than 0.1 seconds. In ASIACRYPT 2016, Proceedings, Part I, volume 10031 of LNCS, pages 3–33. Springer, 2016.
I. Chillotti, N. Gama, M. Georgieva, and M. Izabachène. TFHE: fast fully homomorphic encryption over the torus. J. Cryptol., 33(1):34–91, 2020.
I. Chillotti, D. Ligier, J.-B. Orfila, and S. Tap. Improved programmable bootstrapping with larger precision and efficient arithmetic circuits for tfhe. In M. Tibouchi and H. Wang, editors, Advances in Cryptology – ASIACRYPT 2021, pages 670–699, Cham, 2021. Springer International Publishing.
J. Cooley and J. Tukey. An algorithm for the machine calculation of complex fourier series. Mathematics of Computation, 19(90):297–301, 1965.
M. Creeger. The rise of fully homomorphic encryption: Often called the holy grail of cryptography, commercial FHE is near. ACM Queue, 20(4):39–60, 2022.
DARPA. DARPA:data protection in virtual environments (DPRIVE).
K. Derya, A. C. Mert, E. Öztürk, and E. Savas. Coha-ntt: A configurable hardware accelerator for ntt-based polynomial multiplication. Microprocess. Microsystems, 89:104451, 2022.
L. Ducas and D. Micciancio. FHEW: bootstrapping homomorphic encryption in less than a second. In EUROCRYPT 2015, Proceedings, Part I, volume 9056 of LNCS, pages 617–640. Springer, 2015.
J. Fan and F. Vercauteren. Somewhat practical fully homomorphic encryption. IACR Cryptol. ePrint Arch., page 144, 2012.
C. Gentry. Fully homomorphic encryption using ideal lattices. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, pages 169–178. ACM, 2009.
C. Gentry, S. Halevi, and N. P. Smart. Homomorphic evaluation of the aes circuit. In Advances in Cryptology – CRYPTO 2012, pages 850–867. Springer, 2012.
C. Gentry, A. Sahai, and B. Waters. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Advances in Cryptology - CRYPTO 2013, Part I, pages 75–92. Springer, 2013.
C. Gentry, A. Sahai, and B. Waters. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In CRYPTO 2013, Proceedings, Part I, volume 8042 of LNCS, pages 75–92. Springer, 2013.
S. Halevi, Y. Polyakov, and V. Shoup. An improved rns variant of the bfv homomorphic encryption scheme. In Topics in Cryptology – CT-RSA 2019, pages 83–105. Springer, 2019.
M. Kim, D. Lee, J. Seo, and Y. Song. Accelerating he operations from key decomposition technique. In Advances in Cryptology – CRYPTO 2023: 43rd Annual International Cryptology Conference, Proceedings, Part IV. Springer-Verlag, 2023.
I. Kundu, E. Cottle, F. Michel, J. Wilson, and N. New. The dawn of energy efficient computing: Optically accelerating the fast fourier transform core. In Photonics in Switching and Computing 2021. Optica Publishing Group, 2021.
Y. Lee, S. Heo, S. Cheon, S. Jeong, C. Kim, E. Kim, D. Lee, and H. Kim. Hecate: Performance-aware scale optimization for homomorphic encryption compiler. In 2022 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pages 193–204, 2022.
V. Lyubashevsky, C. Peikert, and O. Regev. EUROCRYPT, chapter On Ideal Lattices and Learning with Errors over Rings, pages 1–23. Springer, 2010.
A. C. Mert, E. Öztürk, and E. Savas. FPGA implementation of a run-time configurable ntt-based polynomial multiplication hardware. Microprocess. Microsystems, 78:103219, 2020.
D. T. Nguyen and K. Gaj. Fast falcon signature generation and verification using armv8 neon instructions. In Progress in Cryptology - AFRICACRYPT 2023, pages 417–441. Springer, 2023.
Ö. Özerk, C. Elgezen, A. C. Mert, E. Öztürk, and E. Savas. Efficient number theoretic transform implementation on GPU for homomorphic encryption. J. Supercomput., 78(2):2840–2872, 2022.
O. Regev. On lattices, learning with errors, random linear codes, and cryptography. J. ACM, 56(6):34:1–34:40, 2009.
S. S. Roy, F. Turan, K. Järvinen, F. Vercauteren, and I. Verbauwhede. FPGA-based high-performance parallel architecture for homomorphic computing on encrypted data. In 25th IEEE International Symposium on High Performance Computer Architecture, pages 387–398. IEEE, 2019.
A. Schönhage and V. Strassen. Schnelle Multiplikation grosser Zahlen. Computing (Arch. Elektron. Rechnen), 7:281–292, 1971.
TFHE-lib. TFHE: Fast Fully Homomorphic Encryption over the Torus. https://tfhe.github.io/tfhe/.
F. Turan, S. S. Roy, and I. Verbauwhede. HEAWS: an accelerator for homomorphic encryption on the amazon AWS FPGA. IEEE Trans. Computers, 2020.
E. R. Türkoglu, A. S. Özcan, C. Ayduman, A. C. Mert, E. Öztürk, and E. Savas. An accelerated GPU library for homomorphic encryption operations of BFV scheme. In IEEE International Symposium on Circuits and Systems, 2022.
Zama. TFHE-rs: A Pure Rust Implementation of the TFHE Scheme for Boolean and Integer Arithmetics Over Encrypted Data, 2024. https://github.com/zama-ai/tfhe-rs.
Author information
Authors and Affiliations
Corresponding authors
Editor information
Editors and Affiliations
A Appendix: Normalization and Reduction Lemma Proof
A Appendix: Normalization and Reduction Lemma Proof
Proof
of Lemma 3.3. Let \(n = \lceil (L+M)/K\rceil \) be the number of algorithm iterations. Let \(\texttt {acc}^{(i)}\) be accumulator value after step 3 of iteration i and let \(\texttt {acc}^{(n+1)} = 0\). At iteration i, \(n\ge i \ge 1\), we have:
here \(\epsilon _{i} = \left\lfloor \texttt {acc}^{(i)} 2^{-K} \right\rfloor \) and has integer values. The evaluation of result polynomial at \(2^{-K}\) gives:
Expanding the sum and simplifying common expressions we obtain:
Now, we will prove that \(\left\| \left( \phi _K(A)-\phi _{K}(R)\right) \bmod \mathcal {R}\right\| _{\infty }\) is smaller than \(2^{-L}\). We have:
Observe that terms \(A_0\) and \(\epsilon _1\) are integers and are reduced by \(\bmod \mathcal {R}\) operation, we obtain:
From (9) it is easy to see that \(A_{i} - \epsilon _{i}2^{K} \equiv R_i - \epsilon _{i+1}\). Using previous relations we have:
Looking at the infinity norm we have:
Element-wise operations are on integers in interval \(\left( -2^{M+1},2^{M+1}\right) \).
The accumulator variable \(\texttt {acc}\) (at step 3) has the largest values during algorithm execution. We will prove that its magnitude (i.e. infinity norm) never exceeds \(2^{M+1}\).
Algorithm step 3 increases accumulator value by at most \(2^M\) and step 5 divides the new value by \(2^K\). After the first iteration, we have \(\left\| \texttt {acc}^{(n)}\right\| _\infty < 2^{M}\), after second iteration \(\left\| \texttt {acc}^{(n-1)}\right\| _\infty < 2^{M-K} + 2^{M}\) and so on until the last iteration where we have \(\left\| \texttt {acc}^{(1)}\right\| _\infty < \sum _{0 \le i < n} 2^{M-iK}\), which is the maximum accumulator magnitude attained during algorithm execution. We can rewrite the last expression as:
which proves the accumulator bound.
Complexity Algorithm steps 3-5 are executed \(\lceil (L+M)/K\rceil \) times. In each iteration 5 operations are performed: an addition (step 3), 3 shifts (2 in step 4 and 1 in step 5) and a subtraction (step 4). The overall complexity of the algorithm is \(5 \cdot \lceil (L+M)/K\rceil \) element-wise operations. \(\square \)
In this paragraph we introduce Algorithm 2, a general conversion and normalization with two limb size K and \(\widetilde{K}\). This algorithm is a slightly more complex than Algorithm 1 with \(K = {\tilde{K}}\), since it must handle additional binary shifts to synchronize the limb sizes, and thus, the for loop on a single index k is replaced by two while loops and two indexes \(k,\tilde{k}\) that decrease at their respective speed. Besides that, it follows the same principle as the single-limb normalization.
Rights and permissions
Copyright information
© 2025 International Association for Cryptologic Research
About this paper
Cite this paper
Belorgey, M.G., Carpov, S., Gama, N., Guasch, S., Jetchev, D. (2025). Revisiting Key Decomposition Techniques for FHE: Simpler, Faster and More Generic. In: Chung, KM., Sasaki, Y. (eds) Advances in Cryptology – ASIACRYPT 2024. ASIACRYPT 2024. Lecture Notes in Computer Science, vol 15484. Springer, Singapore. https://doi.org/10.1007/978-981-96-0875-1_6
Download citation
DOI: https://doi.org/10.1007/978-981-96-0875-1_6
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-96-0874-4
Online ISBN: 978-981-96-0875-1
eBook Packages: Computer ScienceComputer Science (R0)