Towards Efficient Kyber on FPGAs: A Processor for Vector of Polynomials

Published in ASP-DAC 2020, 2020

Recommended citation: Zhaohui Chen, Yuan Ma, Tianyu Chen, Jingqiang Lin, Jiwu Jing. "Towards Efficient Kyber on FPGAs: A Processor for Vector of Polynomials," to appear in the 25th Asia and South Pacific Design Automation Conference (ASP-DAC) 2020. http://cccisi.github.io/files/Kyber.pdf

Download paper here

Kyber is a promising candidate in the post-quantum cryptography standardization process. In this paper, we propose a targeted optimization strategy and implement a processor for Kyber on FPGAs. By merging operations, we cut the clock-cycle count by 29.4% for Kyber512 and 33.3% for Kyber1024 compared with the textbook implementations. We use the Gentleman-Sande (GS) butterfly to optimize the Number-Theoretic Transform (NTT) implementation. A dual-column sequential scheme removes the memory-access bottleneck. We further propose a pipelined architecture for better performance. Together, these optimizations let the processor achieve 31684 NTT operations per second using only 477 LUTs, 237 FFs and 1 DSP. Our strategy is at least 3x more efficient than the state-of-the-art NTT module at a similar security level.
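
For readers unfamiliar with the Gentleman-Sande butterfly named in the abstract, the following is a minimal software sketch, not the hardware design from the paper. It assumes the modulus q = 3329 with 17 as a primitive 256th root of unity (the parameter set evaluated in the paper may differ), and it computes a plain cyclic NTT rather than the exact transform used in Kyber. The function names `gs_butterfly`, `ntt_gs` and `bit_reverse` are illustrative only.

```python
Q = 3329   # assumption: a Kyber modulus; the paper's parameter set may differ
ROOT = 17  # a primitive 256th root of unity modulo 3329


def gs_butterfly(a, b, w, q=Q):
    """Gentleman-Sande butterfly: add and subtract first, then multiply the difference by w."""
    return (a + b) % q, ((a - b) * w) % q


def ntt_gs(coeffs, omega, q=Q):
    """Cyclic NTT by decimation in frequency.

    len(coeffs) must be a power of two and omega a primitive root of unity
    of that order modulo q. The result is in bit-reversed order.
    """
    a = list(coeffs)
    n = len(a)
    half = n // 2
    w_stage = omega
    while half >= 1:
        for start in range(0, n, 2 * half):
            w = 1
            for j in range(start, start + half):
                a[j], a[j + half] = gs_butterfly(a[j], a[j + half], w, q)
                w = (w * w_stage) % q
        # Each sub-transform is half as long, so its root is the square of the previous one.
        w_stage = (w_stage * w_stage) % q
        half //= 2
    return a


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i."""
    return int(format(i, f"0{bits}b")[::-1], 2)
```

Because the GS butterfly multiplies after the subtraction, the transform accepts input in natural order and emits output in bit-reversed order. Reading entry `bit_reverse(k, bits)` of the result recovers the k-th coefficient of the direct transform.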
