Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms

Polynomial multiplications based on the number theoretic transform (NTT) are critical in lattice-based post-quantum cryptography algorithms. Therefore, this paper presents a platform-agnostic unified hardware accelerator design (Unif-NTT) to compute the forward and inverse operations of the NTT for...

Bibliographic Details
Published in: IEEE Access, 2024, Vol. 12, p. 94793-94804
Main Authors: Yahya Hummdi, Ali ; Aljaedi, Amer ; Bassfar, Zaid ; Shaukat Jamal, Sajjad ; Mazyad Hazzazi, Mohammad ; Rehman, Mujeeb Ur
Format: Article
Language: English
cited_by
cites cdi_FETCH-LOGICAL-c216t-59bccc3c7813797949bbbc486771ac654e826aaef9343d02147c601674ba8eb83
container_end_page 94804
container_issue
container_start_page 94793
container_title IEEE access
container_volume 12
creator Yahya Hummdi, Ali
Aljaedi, Amer
Bassfar, Zaid
Shaukat Jamal, Sajjad
Mazyad Hazzazi, Mohammad
Rehman, Mujeeb Ur
description Polynomial multiplications based on the number theoretic transform (NTT) are critical in lattice-based post-quantum cryptography algorithms. This paper therefore presents a platform-agnostic unified hardware accelerator design (Unif-NTT) that computes the forward and inverse NTT for the CRYSTALS-Kyber algorithm. A unified design (Unif-BU) of the Cooley-Tukey and Gentleman-Sande butterflies is also presented, using two adders, multipliers, subtractors, routing multiplexers and Barrett-based modular reduction units. Finally, a dedicated controller is implemented to handle control efficiently. The design is implemented on field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The Unif-NTT requires 1664 and 1792 clock cycles for one forward and one inverse NTT computation, respectively. It can operate at up to 212 MHz on a Virtex-7 FPGA and up to 2.5 GHz on a 28 nm ASIC platform. The Unif-NTT is 26% more efficient in area-time product than the most area-optimized NTT accelerator in the state of the art. The Unif-NTT design suits applications that demand both reasonable hardware resource usage and processing speed.
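The butterfly unification mentioned in the description can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `barrett_reduce` and `unified_butterfly`, the shift width `K = 24`, and the `inverse` mode flag are all hypothetical choices made for illustration. The only values taken from CRYSTALS-Kyber are the modulus 3329 and the root of unity 17. The sketch shows how one routine can select between a Cooley-Tukey and a Gentleman-Sande butterfly while sharing the same Barrett-style reduction step.

```python
Q = 3329            # CRYSTALS-Kyber modulus
K = 24              # assumed Barrett shift width; inputs must be below 2**K
M = (1 << K) // Q   # precomputed floor(2**K / Q)


def barrett_reduce(x: int) -> int:
    """Return x mod Q for 0 <= x < 2**K without a division.

    Uses one multiply, one shift and one conditional subtraction.
    """
    t = (x * M) >> K          # estimate of floor(x / Q), low by at most 1
    r = x - t * Q             # therefore 0 <= r < 2*Q
    return r - Q if r >= Q else r


def unified_butterfly(a: int, b: int, w: int, inverse: bool = False):
    """Hypothetical unified butterfly on inputs a, b, w in [0, Q).

    inverse=False: Cooley-Tukey,    returns (a + w*b, a - w*b) mod Q
    inverse=True:  Gentleman-Sande, returns (a + b, (a - b)*w) mod Q
    """
    if inverse:
        total = barrett_reduce(a + b)
        diff = barrett_reduce(a - b + Q)   # add Q to keep the value non-negative
        return total, barrett_reduce(diff * w)
    t = barrett_reduce(b * w)
    return barrett_reduce(a + t), barrett_reduce(a - t + Q)


if __name__ == "__main__":
    w = 17                  # primitive 256th root of unity modulo Q in Kyber
    w_inv = pow(w, -1, Q)   # modular inverse (Python 3.8+)
    u, v = unified_butterfly(1234, 567, w)
    # Applying the Gentleman-Sande mode with w_inv undoes the Cooley-Tukey
    # mode up to a factor of 2, which an inverse NTT removes by scaling.
    print(unified_butterfly(u, v, w_inv, inverse=True))
```

In this model the difference is reduced before the multiplication in the Gentleman-Sande path so that the product stays below `2**K`; a hardware design may order or share these operations differently.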
doi_str_mv 10.1109/ACCESS.2024.3425813
format article
fullrecord <record><control><sourceid>doaj_cross</sourceid><recordid>TN_cdi_crossref_primary_10_1109_ACCESS_2024_3425813</recordid><sourceformat>XML</sourceformat><sourcesystem>PC</sourcesystem><ieee_id>10591972</ieee_id><doaj_id>oai_doaj_org_article_9378eaf82d44495cb289f0ce851e7e2e</doaj_id><sourcerecordid>oai_doaj_org_article_9378eaf82d44495cb289f0ce851e7e2e</sourcerecordid><originalsourceid>FETCH-LOGICAL-c216t-59bccc3c7813797949bbbc486771ac654e826aaef9343d02147c601674ba8eb83</originalsourceid><addsrcrecordid>eNpNkF9LwzAUxYsoOOY-gT7kC3TmX5vEt1I3Nxjq2PYc0vRmdmytJEPx25vZIbsv93DgHA6_JLkneEwIVo9FWU5WqzHFlI8Zp5kk7CoZUJKrlGUsv77Qt8kohB2OJ6OViUGy3LSNS1_X6ydUoJNuoEYz4-tv4wE9Q2i2LeocmnY-OjUybY3m7Rf4ACimkOs8el-WqNhvO98cPw7hLrlxZh9gdP7DZDOdrMtZunh7mZfFIrVxzjHNVGWtZVbEuUIJxVVVVZbLXAhibJ5xkDQ3BpxinNWYEi5sjkkueGUkVJINk3nfW3dmpz99czD-R3em0X9G57fa-GNj96AVExKMk7TmnKvMVlQqhy3IjIAACrGL9V3WdyF4cP99BOsTZN1D1ifI-gw5ph76VAMAF4lMESUo-wVt1nYy</addsrcrecordid><sourcetype>Open Website</sourcetype><iscdi>true</iscdi><recordtype>article</recordtype></control><display><type>article</type><title>Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms</title><source>IEEE Xplore Open Access Journals</source><creator>Yahya Hummdi, Ali ; Aljaedi, Amer ; Bassfar, Zaid ; Shaukat Jamal, Sajjad ; Mazyad Hazzazi, Mohammad ; Rehman, Mujeeb Ur</creator><creatorcontrib>Yahya Hummdi, Ali ; Aljaedi, Amer ; Bassfar, Zaid ; Shaukat Jamal, Sajjad ; Mazyad Hazzazi, Mohammad ; Rehman, Mujeeb Ur</creatorcontrib><description><![CDATA[Polynomial multiplications based on the number theoretic transform (NTT) are critical in lattice-based post-quantum cryptography algorithms. Therefore, this paper presents a platform-agnostic unified hardware accelerator design (Unif-NTT) to compute the forward and inverse operations of the NTT for the CRYSTALS-Kyber algorithm. 
Moreover, a unified design (Unif-BU) of the Cooley-Tukey and Gentleman-Sande butterflies is presented using two adders, multipliers, subtractors, routing multiplexers and barret-based modular reduction units. Finally, a dedicated controller is implemented for efficient control functionalities. The implementation results are realized on field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The Unif-NTT requires 1664 and 1792 clock cycles for one forward and inverse NTT computations, respectively. It can operate up to a maximum frequency of <inline-formula> <tex-math notation="LaTeX">212MHz </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">2.5GHz </tex-math></inline-formula> over Virtex-7 FPGA and 28nm ASIC platforms, respectively. The Unif-NTT is 26% more efficient in Area-Time-Product compared to the most area-optimized NTT accelerator from the state-of-the-art. The Unif-NTT design is suited for applications that demand reasonable hardware resources with processing speed.]]></description><identifier>ISSN: 2169-3536</identifier><identifier>EISSN: 2169-3536</identifier><identifier>DOI: 10.1109/ACCESS.2024.3425813</identifier><identifier>CODEN: IAECCG</identifier><language>eng</language><publisher>IEEE</publisher><subject>accelerator ; Adders ; ASIC ; Clocks ; Computer architecture ; Cryptography ; Field programmable gate arrays ; FPGA ; hardware ; Hardware acceleration ; Number theoretic transform ; Polynomials ; post-quantum cryptography ; Signal processing algorithms ; Throughput</subject><ispartof>IEEE access, 2024, Vol.12, p.94793-94804</ispartof><lds50>peer_reviewed</lds50><oa>free_for_read</oa><woscitedreferencessubscribed>false</woscitedreferencessubscribed><cites>FETCH-LOGICAL-c216t-59bccc3c7813797949bbbc486771ac654e826aaef9343d02147c601674ba8eb83</cites><orcidid>0000-0003-4099-5025 ; 0000-0002-7945-9994 ; 0000-0003-1172-885X ; 0000-0002-5852-1955 ; 0009-0006-9570-4714 ; 
0000-0002-8154-6560</orcidid></display><links><openurl>$$Topenurl_article</openurl><openurlfulltext>$$Topenurlfull_article</openurlfulltext><thumbnail>$$Tsyndetics_thumb_exl</thumbnail><linktohtml>$$Uhttps://ieeexplore.ieee.org/document/10591972$$EHTML$$P50$$Gieee$$Hfree_for_read</linktohtml><link.rule.ids>314,780,784,4022,27632,27922,27923,27924,54932</link.rule.ids></links><search><creatorcontrib>Yahya Hummdi, Ali</creatorcontrib><creatorcontrib>Aljaedi, Amer</creatorcontrib><creatorcontrib>Bassfar, Zaid</creatorcontrib><creatorcontrib>Shaukat Jamal, Sajjad</creatorcontrib><creatorcontrib>Mazyad Hazzazi, Mohammad</creatorcontrib><creatorcontrib>Rehman, Mujeeb Ur</creatorcontrib><title>Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms</title><title>IEEE access</title><addtitle>Access</addtitle><description><![CDATA[Polynomial multiplications based on the number theoretic transform (NTT) are critical in lattice-based post-quantum cryptography algorithms. Therefore, this paper presents a platform-agnostic unified hardware accelerator design (Unif-NTT) to compute the forward and inverse operations of the NTT for the CRYSTALS-Kyber algorithm. Moreover, a unified design (Unif-BU) of the Cooley-Tukey and Gentleman-Sande butterflies is presented using two adders, multipliers, subtractors, routing multiplexers and barret-based modular reduction units. Finally, a dedicated controller is implemented for efficient control functionalities. The implementation results are realized on field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The Unif-NTT requires 1664 and 1792 clock cycles for one forward and inverse NTT computations, respectively. 
It can operate up to a maximum frequency of <inline-formula> <tex-math notation="LaTeX">212MHz </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">2.5GHz </tex-math></inline-formula> over Virtex-7 FPGA and 28nm ASIC platforms, respectively. The Unif-NTT is 26% more efficient in Area-Time-Product compared to the most area-optimized NTT accelerator from the state-of-the-art. The Unif-NTT design is suited for applications that demand reasonable hardware resources with processing speed.]]></description><subject>accelerator</subject><subject>Adders</subject><subject>ASIC</subject><subject>Clocks</subject><subject>Computer architecture</subject><subject>Cryptography</subject><subject>Field programmable gate arrays</subject><subject>FPGA</subject><subject>hardware</subject><subject>Hardware acceleration</subject><subject>Number theoretic transform</subject><subject>Polynomials</subject><subject>post-quantum cryptography</subject><subject>Signal processing algorithms</subject><subject>Throughput</subject><issn>2169-3536</issn><issn>2169-3536</issn><fulltext>true</fulltext><rsrctype>article</rsrctype><creationdate>2024</creationdate><recordtype>article</recordtype><sourceid>ESBDL</sourceid><sourceid>DOA</sourceid><recordid>eNpNkF9LwzAUxYsoOOY-gT7kC3TmX5vEt1I3Nxjq2PYc0vRmdmytJEPx25vZIbsv93DgHA6_JLkneEwIVo9FWU5WqzHFlI8Zp5kk7CoZUJKrlGUsv77Qt8kohB2OJ6OViUGy3LSNS1_X6ydUoJNuoEYz4-tv4wE9Q2i2LeocmnY-OjUybY3m7Rf4ACimkOs8el-WqNhvO98cPw7hLrlxZh9gdP7DZDOdrMtZunh7mZfFIrVxzjHNVGWtZVbEuUIJxVVVVZbLXAhibJ5xkDQ3BpxinNWYEi5sjkkueGUkVJINk3nfW3dmpz99czD-R3em0X9G57fa-GNj96AVExKMk7TmnKvMVlQqhy3IjIAACrGL9V3WdyF4cP99BOsTZN1D1ifI-gw5ph76VAMAF4lMESUo-wVt1nYy</recordid><startdate>2024</startdate><enddate>2024</enddate><creator>Yahya Hummdi, Ali</creator><creator>Aljaedi, Amer</creator><creator>Bassfar, Zaid</creator><creator>Shaukat Jamal, Sajjad</creator><creator>Mazyad Hazzazi, Mohammad</creator><creator>Rehman, Mujeeb 
Ur</creator><general>IEEE</general><scope>97E</scope><scope>ESBDL</scope><scope>RIA</scope><scope>RIE</scope><scope>AAYXX</scope><scope>CITATION</scope><scope>DOA</scope><orcidid>https://orcid.org/0000-0003-4099-5025</orcidid><orcidid>https://orcid.org/0000-0002-7945-9994</orcidid><orcidid>https://orcid.org/0000-0003-1172-885X</orcidid><orcidid>https://orcid.org/0000-0002-5852-1955</orcidid><orcidid>https://orcid.org/0009-0006-9570-4714</orcidid><orcidid>https://orcid.org/0000-0002-8154-6560</orcidid></search><sort><creationdate>2024</creationdate><title>Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms</title><author>Yahya Hummdi, Ali ; Aljaedi, Amer ; Bassfar, Zaid ; Shaukat Jamal, Sajjad ; Mazyad Hazzazi, Mohammad ; Rehman, Mujeeb Ur</author></sort><facets><frbrtype>5</frbrtype><frbrgroupid>cdi_FETCH-LOGICAL-c216t-59bccc3c7813797949bbbc486771ac654e826aaef9343d02147c601674ba8eb83</frbrgroupid><rsrctype>articles</rsrctype><prefilter>articles</prefilter><language>eng</language><creationdate>2024</creationdate><topic>accelerator</topic><topic>Adders</topic><topic>ASIC</topic><topic>Clocks</topic><topic>Computer architecture</topic><topic>Cryptography</topic><topic>Field programmable gate arrays</topic><topic>FPGA</topic><topic>hardware</topic><topic>Hardware acceleration</topic><topic>Number theoretic transform</topic><topic>Polynomials</topic><topic>post-quantum cryptography</topic><topic>Signal processing algorithms</topic><topic>Throughput</topic><toplevel>peer_reviewed</toplevel><toplevel>online_resources</toplevel><creatorcontrib>Yahya Hummdi, Ali</creatorcontrib><creatorcontrib>Aljaedi, Amer</creatorcontrib><creatorcontrib>Bassfar, Zaid</creatorcontrib><creatorcontrib>Shaukat Jamal, Sajjad</creatorcontrib><creatorcontrib>Mazyad Hazzazi, Mohammad</creatorcontrib><creatorcontrib>Rehman, Mujeeb Ur</creatorcontrib><collection>IEEE All-Society Periodicals Package (ASPP) 2005–Present</collection><collection>IEEE Xplore Open Access 
Journals</collection><collection>IEEE All-Society Periodicals Package (ASPP) 1998–Present</collection><collection>IEEE Xplore / Electronic Library Online (IEL)</collection><collection>CrossRef</collection><collection>DOAJ Directory of Open Access Journals</collection><jtitle>IEEE access</jtitle></facets><delivery><delcategory>Remote Search Resource</delcategory><fulltext>fulltext</fulltext></delivery><addata><au>Yahya Hummdi, Ali</au><au>Aljaedi, Amer</au><au>Bassfar, Zaid</au><au>Shaukat Jamal, Sajjad</au><au>Mazyad Hazzazi, Mohammad</au><au>Rehman, Mujeeb Ur</au><format>journal</format><genre>article</genre><ristype>JOUR</ristype><atitle>Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms</atitle><jtitle>IEEE access</jtitle><stitle>Access</stitle><date>2024</date><risdate>2024</risdate><volume>12</volume><spage>94793</spage><epage>94804</epage><pages>94793-94804</pages><issn>2169-3536</issn><eissn>2169-3536</eissn><coden>IAECCG</coden><abstract><![CDATA[Polynomial multiplications based on the number theoretic transform (NTT) are critical in lattice-based post-quantum cryptography algorithms. Therefore, this paper presents a platform-agnostic unified hardware accelerator design (Unif-NTT) to compute the forward and inverse operations of the NTT for the CRYSTALS-Kyber algorithm. Moreover, a unified design (Unif-BU) of the Cooley-Tukey and Gentleman-Sande butterflies is presented using two adders, multipliers, subtractors, routing multiplexers and barret-based modular reduction units. Finally, a dedicated controller is implemented for efficient control functionalities. The implementation results are realized on field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) platforms. The Unif-NTT requires 1664 and 1792 clock cycles for one forward and inverse NTT computations, respectively. 
It can operate up to a maximum frequency of <inline-formula> <tex-math notation="LaTeX">212MHz </tex-math></inline-formula> and <inline-formula> <tex-math notation="LaTeX">2.5GHz </tex-math></inline-formula> over Virtex-7 FPGA and 28nm ASIC platforms, respectively. The Unif-NTT is 26% more efficient in Area-Time-Product compared to the most area-optimized NTT accelerator from the state-of-the-art. The Unif-NTT design is suited for applications that demand reasonable hardware resources with processing speed.]]></abstract><pub>IEEE</pub><doi>10.1109/ACCESS.2024.3425813</doi><tpages>12</tpages><orcidid>https://orcid.org/0000-0003-4099-5025</orcidid><orcidid>https://orcid.org/0000-0002-7945-9994</orcidid><orcidid>https://orcid.org/0000-0003-1172-885X</orcidid><orcidid>https://orcid.org/0000-0002-5852-1955</orcidid><orcidid>https://orcid.org/0009-0006-9570-4714</orcidid><orcidid>https://orcid.org/0000-0002-8154-6560</orcidid><oa>free_for_read</oa></addata></record>
fulltext fulltext
identifier ISSN: 2169-3536
ispartof IEEE access, 2024, Vol.12, p.94793-94804
issn 2169-3536
2169-3536
language eng
recordid cdi_crossref_primary_10_1109_ACCESS_2024_3425813
source IEEE Xplore Open Access Journals
subjects accelerator
Adders
ASIC
Clocks
Computer architecture
Cryptography
Field programmable gate arrays
FPGA
hardware
Hardware acceleration
Number theoretic transform
Polynomials
post-quantum cryptography
Signal processing algorithms
Throughput
title Unif-NTT: A Unified Hardware Design of Forward and Inverse NTT for PQC Algorithms
url http://sfxeu10.hosted.exlibrisgroup.com/loughborough?ctx_ver=Z39.88-2004&ctx_enc=info:ofi/enc:UTF-8&ctx_tim=2025-01-10T09%3A49%3A41IST&url_ver=Z39.88-2004&url_ctx_fmt=infofi/fmt:kev:mtx:ctx&rfr_id=info:sid/primo.exlibrisgroup.com:primo3-Article-doaj_cross&rft_val_fmt=info:ofi/fmt:kev:mtx:journal&rft.genre=article&rft.atitle=Unif-NTT:%20A%20Unified%20Hardware%20Design%20of%20Forward%20and%20Inverse%20NTT%20for%20PQC%20Algorithms&rft.jtitle=IEEE%20access&rft.au=Yahya%20Hummdi,%20Ali&rft.date=2024&rft.volume=12&rft.spage=94793&rft.epage=94804&rft.pages=94793-94804&rft.issn=2169-3536&rft.eissn=2169-3536&rft.coden=IAECCG&rft_id=info:doi/10.1109/ACCESS.2024.3425813&rft_dat=%3Cdoaj_cross%3Eoai_doaj_org_article_9378eaf82d44495cb289f0ce851e7e2e%3C/doaj_cross%3E%3Cgrp_id%3Ecdi_FETCH-LOGICAL-c216t-59bccc3c7813797949bbbc486771ac654e826aaef9343d02147c601674ba8eb83%3C/grp_id%3E%3Coa%3E%3C/oa%3E%3Curl%3E%3C/url%3E&rft_id=info:oai/&rft_id=info:pmid/&rft_ieee_id=10591972&rfr_iscdi=true