WayToFFT Part 2
Way To NTT……
Note
The Number Theoretic Transform (NTT) is a powerful mathematical tool that has become increasingly important in developing Post Quantum Cryptography (PQC) and Homomorphic Encryption (HE).
Review
Maybe those are all you need link1,link2.
Notation
- Textbook-style Convolution
- linear convolution
- cyclic convolution
- Negacyclic Convolution
Review - Linear Convolution
Converting Coefficients Multiplication into Vector Multiplication.
Polynomial multiplication:
Having $G(x)$ and $H(x)$ as polynomials of degree $n-1$ in the ring $\mathbb{Z_q} $$x$$$ where q $\in \mathbb{Z}$ and $x$ is the polynomial variable. Generally speaking, a Polynomial Multiplication of $G(x)$ and $H(x)$ is defined as: $$Y(x) = G(x)\cdot H(x) = \sum_{k=0}^{2(n-1)}y_kx^k$$ where $y_k = \sum_{i=0}^{k} g_i h_{k-i} \mod q$, $g$ and $h$ are the polynomial coefficients of $G(x)$ and $H(x)$ respectively.
About $y_k = \sum_{i=0}^{k} g_i h_{k-i} \mod q$
Simply This is merely a symbolic expression used to describe the final result.
Think about what happens when you manually multiply two polynomials, $G(x)$ and $H(x)$. How do you get the final term for $x^k$?
You get it by finding every pair of terms—one from $G(x)$ and one from $H(x)$—whose exponents add up to exactly $k$.
Specifically, to get an $x^k$ term, you must multiply a term like $g_ix^i$ from G(x) with a term like $h_jx^j$ from $H(x)$ where $i+j=k$.
The formula is simply a systematic way of finding and adding up all these pairs. $$ y_k = \sum_{i=0}^{k} g_i h_{k-i} $$ This formula iterates through all the possibilities:
- When $i=0$, it pairs the constant term from $G(x)$ (coefficient $g_{0}$) with the $x^k$ term from $H(x)$ (coefficient $h^k$). This gives you the product $g_{0}h^{k}$.
- When $i=1$, it pairs the $x^1$ term from $G(x)$ (coefficient $g_{1}$) with the $x^{k−1}$ term from $H(x)$ (coefficient $h^{k−1}$). This gives you $g_{1}h^{k−1}$.
- …and so on.
- This continues until $i=k$, pairing the $x^{k}$ term from $G(x)$ (coefficient $g_{k}$) with the constant term from $H(x)$ (coefficient $h_{0}$), which gives you $g_{k}h^{0}$.
The final coefficient, $y_k$, is just the sum of all these products. The formula is the compact, formal way of writing this exact process, which is also known as a discrete convolution.
So, just import some notation about coefficient vector. We can convert Coefficients Multiplication into Vector Multiplication.
$$ y[k]= \boldsymbol{g} \star \boldsymbol{h}[k] = \sum_{i=0}^{k}g[i]h[k-i] $$
Of course. The formula you’ve shown is a discrete linear convolution. While it’s not a simple one-to-one vector multiplication, it can be understood as a form of vector multiplication in three main ways.
Understood as a Series of Dot Products
We can think of the calculation of each output element y[k]
as a single vector dot product.
The formula is: $y[k] = \sum_{i=0}^{k} g[i]h[k-i] = g[0]h[k] + g[1]h[k-1] + \dots + g[k]h[0]$
To calculate y[k]
, we need two vectors:
- The first
k+1
elements of vectorg
:[g[0], g[1], ..., g[k]]
- The first
k+1
elements of vectorh
, but reversed:[h[k], h[k-1], ..., h[0]]
Then, y[k]
is simply the dot product of these two vectors.
Intuition:
This is known as the “Flip and Slide” method. Imagine flipping the vector h
and then sliding it across the vector g
one step at a time. At each step, the dot product of the overlapping parts is calculated, which gives one element of the resulting vector y
.
Understood as Matrix-Vector Multiplication
This is the most direct algebraic representation. We can expand one of the vectors (e.g., g
) into a special matrix called a Toeplitz Matrix or Convolution Matrix. The entire convolution operation then becomes a single matrix-vector multiplication.
Let’s assume g = [g₀, g₁, g₂]
and h = [h₀, h₁, h₂]
.
The result of the linear convolution, y
, will have a length of (3+3-1) = 5. We can construct the following matrix-vector multiplication:
$$ \begin{pmatrix} y_0 \\ y_1 \\ y_2 \\ y_3 \\ y_4 \end{pmatrix} = \begin{pmatrix} g_0 & 0 & 0 \\ g_1 & g_0 & 0 \\ g_2 & g_1 & g_0 \\ 0 & g_2 & g_1 \\ 0 & 0 & g_2 \end{pmatrix} \begin{pmatrix} h_0 \\ h_1 \\ h_2 \end{pmatrix} $$
If you work out the matrix multiplication, you will find that the results match the convolution definition exactly:
- $y_0 = g_0 h_0$
- $y_1 = g_1 h_0 + g_0 h_1$
- $y_2 = g_2 h_0 + g_1 h_1 + g_0 h_2$ …and so on.
This method packages the entire convolution process into a single linear transformation, which is the clearest “vector multiplication” representation from a linear algebra perspective.
Review - Cyclic Convolution
Positive Wrapped Convolution as $PWC(x)$
Suppose that $G(x)$ and $H(x)$ are polynomials of degree $n-1$ in the quotient ring $\mathbb{Z_{q}}[x]/(x^{n}-1)$ where $q \in \mathbb{Z}$. A Cyclic Convolution or positive wrapped convolution is defined as: $$PWC(x) = \sum_{k=0}^{n-1} c_{k}x^{k}$$ where $c_{k}= \sum_{i=0}^{k}g_{i}h_{k-i} + \sum_{i=k+1}^{n-1} g_{i}h_{k+n-i}$. If $Y(x)$ is the result of their linear convolution in the ring $\mathbb{Z}_{q}[x]$, it also can be defined as: $$PWC(x) = Y(x) \mod (x^{n} - 1)$$
Review - Negacyclic Convolution
Definition 2.3. Suppose that $G(x)$ and $H(x)$ are polynomials of degree $n-1$ in the quotient ring $\mathbb{Z}[x]/(x^{n}+1)$ where $q \in \mathbb{Z}$. A negacyclic convolution or negative wrapped convolution, $\operatorname{NWC}(x)$ is defined as:
$$ \operatorname{NWC}(x)=\sum_{k=0}^{n-1}c_{k}x^{k} $$ where $c_{k}=\sum_{i=0}^{k}g_{i}h_{k-i}-\sum_{i=k+1}^{n-1}g_{i}h_{k+n-i} \mod q$. If $Y(x)$ is the result of their linear convolution in the ring $\mathbb{Z}[x]$, it also can be defined as $$ \operatorname{NWC}(x)=Y(x) \bmod (x^{n}+1)$$
Key Observation
The above is a brief description of polynomial multiplication.
Review - NTT-Based Convolution.
Number Theoretic Transform Based on $\omega$
Review - Fast NTT
3. NTT-Based Convolutions
3.1 Primitive n-th Root of Unity
Let $\mathbb{Z}_q$ be the integer ring modulo $q$.
An element $\omega \in \mathbb{Z}_q$ is called a primitive $n$-th root of unity if:
$$ \omega^n \equiv 1 \pmod{q} $$ and $$ \omega^k \not\equiv 1 \pmod{q} \quad \text{for all } k < n. $$
Example (Kyber-style parameters, but small $n$ for illustration):
In $\mathbb{Z}_{7681}$, for $n = 4$, the 4-th roots of unity are ${3383, 4298, 7680}$.
Only $3383$ and $4298$ are primitive.
These $\omega$ values are the “twiddle factors” for NTT.
3.2 NTT for Positive-Wrapped (Cyclic) Convolution
The Number Theoretic Transform of $a = [a_0, \dots, a_{n-1}]$ is:
$$ \hat{a_j} = \sum_{i=0}^{n-1} \omega^{ij} a_i \pmod{q}, \quad j = 0,\dots, n-1 $$
This is DFT in the modular world.
Inverse NTT (INTT) replaces $\omega$ with $\omega^{-1}$ and multiplies by $n^{-1} \pmod{q}$.
NTT Convolution Theorem
For $a, b \in \mathbb{Z}_q^n$:
$$ c = \operatorname{INTT}(\operatorname{NTT}(a) \circ \operatorname{NTT}(b)) $$
where $\circ` = elementwise multiplication.
Example:
Using $n=4$, $q=7681$, $\omega=3383$,
$g = [1, 2, 3, 4]$, $h = [5, 6, 7, 8]$
⇒ NTT(g) = [10, 913, 7679, 6764],
NTT(h) = [26, 913, 7679, 6764]
⇒ elementwise multiply, INTT → cyclic convolution [66, 68, 66, 60].
3.3 NTT for Negative-Wrapped (Negacyclic) Convolution
For rings $\mathbb{Z}_q[x]/(x^n + 1)$, we need the primitive $2n$-th root of unity $\psi$:
$$ \psi^2 \equiv \omega \pmod{q}, \quad \psi^n \equiv -1 \pmod{q} $$
The negative-wrapped NTT is:
$$ \hat{a_j} = \sum_{i=0}^{n-1} \psi^{2ij + i} a_i \pmod{q} $$
Inverse uses $\psi^{-1}$ and $n^{-1}$.
Negacyclic Convolution Theorem
Example:
$n=4$, $q=7681$, $\omega=3383$, choose $\psi=1925$.
$\text{NTT}\psi(g) = [1467, 2807, 3471, 7621]$
$\text{NTT}\psi(h) = [2489, 7489, 6478, 6607]$
Multiply elementwise, apply $\text{INTT}_\psi$ → [−56, −36, 2, 60].
3.4 Choosing the Modulus $q$
For PWC-NTT (cyclic convolution): $n \mid (q-1)$.
For NWC-NTT (negacyclic convolution): $2n \mid (q-1)$.
Example PQC parameters:
Scheme n q PWC? NWC? Kyber V3 256 3329 ✔ ✘ Dilithium 256 8380417 ✔ ✔
4. Fast NTT: FFT-Style Acceleration
Direct NTT = $O(n^2)$ complexity.
FFT-style divide and conquer gives $O(n\log n)$.
4.1 Cooley–Tukey (CT) Butterfly for NTT
Splits even/odd indices:
$$ \hat{a_j} = A_j + \psi^{2j+1} B_j $$ $$ \hat{a_{j+n/2}} = A_j - \psi^{2j+1} B_j $$
Requires inputs in Normal Order (NO), outputs in Bit-Reversed Order (BO).
4.2 Gentleman–Sande (GS) Butterfly for INTT
Splits top/bottom halves:
$$ a_{2i} = (A_i + B_i)\psi^{-2i} $$ $$ a_{2i+1} = (A_i - B_i)\psi^{-2i} $$
Takes BO input, outputs NO.
4.3 Polynomial Multiplication via CT + GS
- CT butterfly for NTT(a), NTT(b)
- Elementwise multiply results
- GS butterfly for INTT
This reduces modular polynomial multiplication from $O(n^2)$ → $O(n\log n)$.
4.4 Normal Order (NO) and Bit-Reversed Order (BO)
- NO: Standard indexing [0,1,2,…,n-1]
- BO: Reverse the binary representation of the index.
Example $n=4$:
NO = [0, 1, 2, 3]
BO = [0, 2, 1, 3]
CT: NO → BO
GS: BO → NO
- NTT ≈ modular DFT for polynomials
- $\omega$ → primitive $n$-th root of unity (cyclic conv)
- $\psi$ → primitive $2n$-th root (negacyclic conv)
- CT butterfly for forward transform, GS butterfly for inverse
- NO/BO ordering is essential for implementation
- $O(n\log n)$ complexity makes it usable in PQC lattice schemes
summary
FFT-style NTT algorithm or Fast-NTT is particularly useful in lattice-based cryptography.
:)