LogUp-GKR proposition to simplify input layer polynomials #1417

Closed · plafer opened this issue Jul 28, 2024 · 4 comments

plafer (Contributor) commented Jul 28, 2024

I believe we can simplify proving the input layer by getting rid of FinalLayerProof.before_merge_proof, following an argument analogous to how Thaler simplified the layer polynomial definition from the original GKR paper in this note.

This would simplify the code for the input layer, as well as reduce proving time (since we run $\nu - 1$ fewer sum-check rounds).

Notation

I think there's a bug with GitHub's LaTeX rendering where I can't render braces {}, so 0,1^ν will stand for {0,1}^ν.

We will use the following notation, which is very similar to what was used in other issues. It is probably easier to skip to the next section, and only refer back to specific parts of this list when needed.

  • there are $2^\nu$ terms in the GKR Sum (e.g. there are currently 8 terms with the range checker only),
  • the trace is of size $2^n$,
  • the MLEs of the trace columns are denoted $\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x})$,
  • numerator merge polynomial: $g_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$, $\quad z \in {0,1}^\nu$ and $\hat{x} \in \mathbb{F}^n$
  • denominator merge polynomial: $h_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$, $\quad z \in {0,1}^\nu$ and $\hat{x} \in \mathbb{F}^n$

In LogUp-GKR, each layer is interpolated by 2 polynomials. We will denote the input layer polynomials as

  • $p_d(\hat{z}, \hat{x}) = \sum_{z \in {0,1}^\nu} eq(z, \hat{z}) g_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$,
  • $q_d(\hat{z}, \hat{x}) = \sum_{z \in {0,1}^\nu} eq(z, \hat{z}) h_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$.
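
As a concrete (and purely illustrative) reference, here is a minimal sketch of these definitions. All names are hypothetical and `f64` stands in for a field element; this is not the code in this repo.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// eq(z, z_hat) = prod_i (z_i * z_hat_i + (1 - z_i) * (1 - z_hat_i)).
fn eq(z: &[Felt], z_hat: &[Felt]) -> Felt {
    z.iter()
        .zip(z_hat)
        .map(|(zi, zhi)| zi * zhi + (1.0 - zi) * (1.0 - zhi))
        .product()
}

/// p_d(z_hat, x_hat) = sum_{z in 0,1^nu} eq(z, z_hat) * g_[z](f_1(x_hat), ..., f_c(x_hat)).
/// `numerators[i]` holds g_[z](...) at the fixed x_hat, where i indexes the
/// boolean points z in lexicographic order.
fn p_d_at(z_hat: &[Felt], numerators: &[Felt]) -> Felt {
    let nu = z_hat.len();
    assert_eq!(numerators.len(), 1 << nu);
    (0..1usize << nu)
        .map(|idx| {
            // destructure the index into the boolean point z = (z_1, ..., z_nu)
            let z: Vec<Felt> = (0..nu)
                .map(|b| ((idx >> (nu - 1 - b)) & 1) as Felt)
                .collect();
            eq(&z, z_hat) * numerators[idx]
        })
        .sum()
}
```

The same helper with the $h_{[z]}$ values in place of the $g_{[z]}$ values gives $q_d$.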

Similarly, $p_{d-1}$ and $q_{d-1}$ will be used to refer to the corresponding polynomials of the input layer's successor (i.e., layer $d-1$).

Also, given any function $\psi(x, y)$, we define $\psi_{\hat{x}}(y) = \psi(\hat{x}, y)$. In words, this is the notation we'll use to bind a variable $x$ to a fixed value $\hat{x}$.

Lastly, we will automatically "destructure" variables. Specifically, if $x = (x_0, x_1)$, then $\psi(x) = \psi(x_0, x_1)$.

Current treatment of input layer

Let $w \in {0,1}^{\nu - 1}$. The input layer is currently defined as

$$ (p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x}) = \sum_{w \in {0,1}^{\nu - 1}} \sum_{x \in {0,1}^n} eq(w,\hat{w}) eq(x, \hat{x}) [p_d(0, w, x)q_d(1, w, x) + p_d(1, w, x)q_d(0, w, x) + \lambda q_d(0, w, x) q_d(1, w, x)] $$

We run a sum-check over $w$ and $x$ in 2 phases (using GkrComposition and GkrCompositionMerge, respectively). The first phase's result is stored in FinalLayerProof.before_merge_proof, while the second phase's result is stored in FinalLayerProof.after_merge_proof.

The first sum-check is over $w$ (i.e. only the $w$ variables are bound in this phase). Once we have completed the $\nu - 1$ rounds, we will have sampled $\nu - 1$ random elements, which we'll call $\rho_w$. This corresponds to merge_randomness in GkrCompositionMerge::new.

Then, the second sum-check is over the remaining $x$ variables, and uses $\rho_w$ instead of $\hat{w}$. It is defined as:

$$ \sum_{x \in {0,1}^n} eq(x, \hat{x}) [\tilde{p}_d(0, \rho_w, x)\tilde{q}_d(1, \rho_w, x) + \tilde{p}_d(1, \rho_w, x)\tilde{q}_d(0, \rho_w, x) + \lambda \tilde{q}_d(0, \rho_w, x) \tilde{q}_d(1, \rho_w, x)] $$

where we abuse the $\tilde{p}_d$ and $\tilde{q}_d$ notation to mean:

$$ \begin{align} \tilde{p}_d(0, \rho_w, x) &= \sum_w eq(w, \rho_w) p_d(0, w, x) \\ \tilde{q}_d(0, \rho_w, x) &= \sum_w eq(w, \rho_w) q_d(0, w, x) \end{align} $$
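
A minimal sketch of this merge (hypothetical names, `f64` in place of field elements; this is not how GkrCompositionMerge is implemented, just an illustration of the formula): given the table of values $p_d(0, w, x)$ over boolean $(w, x)$ and the randomness $\rho_w$, it produces the table of $\tilde{p}_d(0, \rho_w, x)$ over boolean $x$ that the second phase runs over.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// eq(w, rho) = prod_i (w_i * rho_i + (1 - w_i) * (1 - rho_i)).
fn eq(w: &[Felt], rho: &[Felt]) -> Felt {
    w.iter()
        .zip(rho)
        .map(|(wi, ri)| wi * ri + (1.0 - wi) * (1.0 - ri))
        .product()
}

/// out[x] = sum_{w in 0,1^(nu-1)} eq(w, rho_w) * table[w][x], where `table[w_idx]`
/// is the row of p_d(0, w, .) (or q_d(0, w, .), etc.) over the boolean x assignments.
fn merge_over_w(table: &[Vec<Felt>], rho_w: &[Felt]) -> Vec<Felt> {
    let num_w = rho_w.len();
    assert_eq!(table.len(), 1 << num_w);
    let num_x = table[0].len();
    let mut out = vec![0.0; num_x];
    for (w_idx, row) in table.iter().enumerate() {
        // destructure the row index into the boolean point w = (w_1, ..., w_{nu-1})
        let w: Vec<Felt> = (0..num_w)
            .map(|b| ((w_idx >> (num_w - 1 - b)) & 1) as Felt)
            .collect();
        let weight = eq(&w, rho_w);
        for (o, v) in out.iter_mut().zip(row) {
            *o += weight * v;
        }
    }
    out
}
```

The simplification proposed in the next section amounts to calling this same merge with $\hat{w}$ in place of $\rho_w$, skipping the first sum-check phase entirely.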

Simplification of input layer

I believe we can forgo the first sum-check over $w$, and define our input layer sum-check as

$$ (p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x}) = \sum_{x \in {0,1}^n} eq(x, \hat{x}) [\tilde{p}_d(0, \hat{w}, x)\tilde{q}_d(1, \hat{w}, x) + \tilde{p}_d(1, \hat{w}, x)\tilde{q}_d(0, \hat{w}, x) + \lambda \tilde{q}_d(0, \hat{w}, x) \tilde{q}_d(1, \hat{w}, x)], $$

where we reuse the same abuse of notation for $\tilde{p}_d$ and $\tilde{q}_d$ from the previous section. In practice, this reduces to only running the FinalLayerProof.after_merge_proof part of the input layer, and using $\hat{w}$ as the merge_randomness argument to GkrCompositionMerge::new.

Formally, this reuses the idea from Thaler's note, specifically equations 3 and 4 and the subsequent proof. Basically, to show that the right-hand side is an equivalent definition of $(p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x})$, we only need to show that it is a proper MLE of the input layer data; that is, that it

  1. properly interpolates the input layer data, and
  2. is multilinear.

This, along with the fact that the MLE of a function is unique, completes the proof. I believe the same proof (after equation 4) from Thaler's note applies here, so I will leave it out.
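
For reference, the uniqueness fact being relied on is the standard one: a multilinear polynomial is determined by its values on the boolean hypercube, i.e.

$$ P, Q \text{ multilinear over } \mathbb{F}^k \text{ and } P(b) = Q(b) \text{ for all } b \in {0,1}^k \implies P = Q, $$

so establishing points 1 and 2 above would identify the right-hand side with $(p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x})$.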

Al-Kindi-0 (Collaborator) commented

The argument needs linearity in order for it to go through, but the expression proposed in the simplification is not linear in $\hat{w}$ as far as I can see: there is a product between $\tilde{p}_d$ and $\tilde{q}_d$, and both are functions of $\hat{w}$.
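
Concretely, expanding the first product in the proposed right-hand side makes the obstruction visible:

$$ \tilde{p}_d(0, \hat{w}, x)\, \tilde{q}_d(1, \hat{w}, x) = \sum_{w, w' \in {0,1}^{\nu - 1}} eq(w, \hat{w})\, eq(w', \hat{w})\, p_d(0, w, x)\, q_d(1, w', x), $$

and for fixed boolean $w, w'$ each factor $eq(\cdot, \hat{w})$ is linear in every coordinate of $\hat{w}$, so their product is in general quadratic in each coordinate. The right-hand side is therefore not multilinear in $\hat{w}$, and the uniqueness argument does not apply.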

plafer (Contributor, Author) commented Aug 8, 2024

Ah yes, good catch, Thaler's simplification indeed doesn't apply in this case.

plafer closed this as not planned on Aug 8, 2024
Al-Kindi-0 (Collaborator) commented

(I am re-opening based on an offline discussion.)

Indeed, Thaler's idea does not seem to apply here but nevertheless we can simplify the current treatment of the input layer by making the following observation. The equation linking the input layer to its successor can be expressed as follows:

$$ p_{d-1}\left(v_2, \cdots, v_m, x_1, \cdots, x_n\right) = \sum_{y\in\mathbb{B}_n} G ( y ) $$

and

$$ q_{d-1}\left(v_2, \cdots, v_m, x_1, \cdots, x_n\right) = \sum_{y\in\mathbb{B}_n} H\left(y \right) $$

where $\mathbb{B}_k$ denotes the boolean hypercube of dimension $k$,

$$ G := \left( \left(y_1, \cdots, y_n\right) \longrightarrow EQ\left(\left(x_1, \cdots, x_n\right), \left(y_1, \cdots, y_n\right)\right) \left( \sum_{\left(w_2, \cdots, w_m\right) \in \mathbb{B}_{m-1}} EQ\left(\left(v_2, \cdots, v_m\right), \left(w_2, \cdots, w_m\right)\right) \cdot \left( p_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) + p_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right)\right) \right)\right) $$

and

$$ H := \left( \left(y_1, \cdots, y_n\right) \longrightarrow EQ\left(\left(x_1, \cdots, x_n\right), \left(y_1, \cdots, y_n\right)\right) \left( \sum_{\left(w_2, \cdots, w_m\right) \in \mathbb{B}_{m-1}} EQ\left(\left(v_2, \cdots, v_m\right), \left(w_2, \cdots, w_m\right)\right) \cdot q_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \right)\right). $$

From a performance standpoint, the tree depth stays the same, but we avoid running $m - 1$ rounds of the degree-3 sum-check and simplify the treatment of the input layer, which is a big plus.
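
A minimal sketch of this observation (hypothetical names, `f64` in place of field elements; not the actual implementation): the whole $w$-sum is folded into the evaluation of $G$ and $H$ at each boolean point $y$, so the remaining sum-check only runs over the $y$ variables.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// EQ(a, b) = prod_i (a_i * b_i + (1 - a_i) * (1 - b_i)).
fn eq(a: &[Felt], b: &[Felt]) -> Felt {
    a.iter()
        .zip(b)
        .map(|(ai, bi)| ai * bi + (1.0 - ai) * (1.0 - bi))
        .product()
}

/// Evaluates G at one boolean point y. `p` and `q` are stand-ins for however the
/// input-layer values p_d(b, w, y) and q_d(b, w, y) are provided (b in 0,1 and
/// w, y boolean); `x_point` is (x_1, ..., x_n) and `v_point` is (v_2, ..., v_m).
fn g_at(
    y: &[Felt],
    x_point: &[Felt],
    v_point: &[Felt],
    p: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
    q: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
) -> Felt {
    let k = v_point.len(); // number of w variables, i.e. m - 1
    let mut inner = 0.0;
    for idx in 0..1usize << k {
        // destructure the index into the boolean point w = (w_2, ..., w_m)
        let w: Vec<Felt> = (0..k)
            .map(|b| ((idx >> (k - 1 - b)) & 1) as Felt)
            .collect();
        inner += eq(&w, v_point)
            * (p(1.0, &w, y) * q(0.0, &w, y) + p(0.0, &w, y) * q(1.0, &w, y));
    }
    eq(x_point, y) * inner
}

/// H is identical except that the inner term is q_d(1, w, y) * q_d(0, w, y).
fn h_at(
    y: &[Felt],
    x_point: &[Felt],
    v_point: &[Felt],
    q: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
) -> Felt {
    let k = v_point.len();
    let mut inner = 0.0;
    for idx in 0..1usize << k {
        let w: Vec<Felt> = (0..k)
            .map(|b| ((idx >> (k - 1 - b)) & 1) as Felt)
            .collect();
        inner += eq(&w, v_point) * q(1.0, &w, y) * q(0.0, &w, y);
    }
    eq(x_point, y) * inner
}
```

Summing `g_at` (respectively `h_at`) over all boolean $y$ reproduces the two displayed sums above.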

Al-Kindi-0 reopened this on Aug 15, 2024
plafer (Contributor, Author) commented Aug 21, 2024

Implemented in facebook/winterfell#295

plafer closed this as completed on Aug 21, 2024