LogUp-GKR proposition to simplify input layer polynomials #1417

Closed · plafer opened this issue Jul 28, 2024 · 4 comments

plafer (Contributor) commented Jul 28, 2024

I believe we can simplify proving the input layer by getting rid of FinalLayerProof.before_merge_proof, following an argument analogous to how Thaler simplified the layer polynomial definition from the original GKR paper in this note.

This would simplify the code for the input layer, as well as reduce proving time (since we run $\nu - 1$ fewer sum-check rounds).

Notation

I think there's a bug with GitHub's LaTeX rendering where I can't render braces {}, so 0,1^ν will stand for {0,1}^ν.

We will use the following notation, which is very similar to what was used in other issues. It is probably easier to skip to the next section, and only refer back to specific parts of this list when needed.

  • there are $2^\nu$ terms in the GKR Sum (e.g. there are currently 8 terms with the range checker only),
  • the trace is of size $2^n$,
  • the MLEs of the trace columns are denoted $\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x})$,
  • numerator merge polynomial: $g_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$, $\quad z \in {0,1}^\nu$ and $\hat{x} \in \mathbb{F}^n$
  • denominator merge polynomial: $h_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$, $\quad z \in {0,1}^\nu$ and $\hat{x} \in \mathbb{F}^n$

In LogUp-GKR, each layer is interpolated by 2 polynomials. We will denote the input layer polynomials as

  • $p_d(\hat{z}, \hat{x}) = \sum_{z \in {0,1}^\nu} eq(z, \hat{z}) g_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$,
  • $q_d(\hat{z}, \hat{x}) = \sum_{z \in {0,1}^\nu} eq(z, \hat{z}) h_{[z]}(\tilde{f}_1(\hat{x}), \dots, \tilde{f}_c(\hat{x}))$.
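
As a concrete (and purely illustrative) reference, here is a minimal sketch of these definitions. All names are hypothetical and `f64` stands in for a field element; this is not the code in this repo.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// eq(z, z_hat) = prod_i (z_i * z_hat_i + (1 - z_i) * (1 - z_hat_i)).
fn eq(z: &[Felt], z_hat: &[Felt]) -> Felt {
    z.iter()
        .zip(z_hat)
        .map(|(zi, zhi)| zi * zhi + (1.0 - zi) * (1.0 - zhi))
        .product()
}

/// p_d(z_hat, x_hat) = sum_{z in 0,1^nu} eq(z, z_hat) * g_[z](f_1(x_hat), ..., f_c(x_hat)).
/// `numerators[i]` holds g_[z](...) at the fixed x_hat, where i indexes the
/// boolean points z in lexicographic order.
fn p_d_at(z_hat: &[Felt], numerators: &[Felt]) -> Felt {
    let nu = z_hat.len();
    assert_eq!(numerators.len(), 1 << nu);
    (0..1usize << nu)
        .map(|idx| {
            // destructure the index into the boolean point z = (z_1, ..., z_nu)
            let z: Vec<Felt> = (0..nu)
                .map(|b| ((idx >> (nu - 1 - b)) & 1) as Felt)
                .collect();
            eq(&z, z_hat) * numerators[idx]
        })
        .sum()
}
```

The same helper with the $h_{[z]}$ values in place of the $g_{[z]}$ values gives $q_d$.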

Similarly, $p_{d-1}$ and $q_{d-1}$ will be used to refer to the corresponding polynomials of the input layer's successor (i.e., layer $d-1$).

Also, given any function $\psi(x, y)$, we define $\psi_{\hat{x}}(y) = \psi(\hat{x}, y)$. In words, this is the notation we'll use to bind a variable $x$ to a fixed value $\hat{x}$.

Lastly, we will automatically "destructure" variables. Specifically, if $x = (x_0, x_1)$, then $\psi(x) = \psi(x_0, x_1)$.

Current treatment of input layer

Let $w \in {0,1}^{\nu - 1}$. The input layer is currently defined as

$$ (p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x}) = \sum_{w \in {0,1}^{\nu - 1}} \sum_{x \in {0,1}^n} eq(w,\hat{w}) eq(x, \hat{x}) [p_d(0, w, x)q_d(1, w, x) + p_d(1, w, x)q_d(0, w, x) + \lambda q_d(0, w, x) q_d(1, w, x)] $$

We run a sum-check over $w$ and $x$ in 2 phases (using GkrComposition and GkrCompositionMerge, respectively). The first phase's result is stored in FinalLayerProof.before_merge_proof, while the second phase's result is stored in FinalLayerProof.after_merge_proof.

The first sum-check is over $w$ (i.e. only the $w$ variables are bound in this phase). Once we have completed the $\nu - 1$ rounds, we will have sampled $\nu - 1$ random elements, which we'll call $\rho_w$. This corresponds to merge_randomness in GkrCompositionMerge::new.

Then, the second sum-check is over the remaining $x$ variables, and uses $\rho_w$ instead of $\hat{w}$. It is defined as:

$$ \sum_{x \in {0,1}^n} eq(x, \hat{x}) [\tilde{p}_d(0, \rho_w, x)\tilde{q}_d(1, \rho_w, x) + \tilde{p}_d(1, \rho_w, x)\tilde{q}_d(0, \rho_w, x) + \lambda \tilde{q}_d(0, \rho_w, x) \tilde{q}_d(1, \rho_w, x)] $$

where we abuse the $\tilde{p}_d$ and $\tilde{q}_d$ notation to mean:

$$ \begin{align} \tilde{p}_d(0, \rho_w, x) &= \sum_w eq(w, \rho_w) p_d(0, w, x) \\ \tilde{q}_d(0, \rho_w, x) &= \sum_w eq(w, \rho_w) q_d(0, w, x) \end{align} $$
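
A minimal sketch of this merge (hypothetical names, `f64` in place of field elements; this is not how GkrCompositionMerge is implemented, just an illustration of the formula): given the table of values $p_d(0, w, x)$ over boolean $(w, x)$ and the randomness $\rho_w$, it produces the table of $\tilde{p}_d(0, \rho_w, x)$ over boolean $x$ that the second phase runs over.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// eq(w, rho) = prod_i (w_i * rho_i + (1 - w_i) * (1 - rho_i)).
fn eq(w: &[Felt], rho: &[Felt]) -> Felt {
    w.iter()
        .zip(rho)
        .map(|(wi, ri)| wi * ri + (1.0 - wi) * (1.0 - ri))
        .product()
}

/// out[x] = sum_{w in 0,1^(nu-1)} eq(w, rho_w) * table[w][x], where `table[w_idx]`
/// is the row of p_d(0, w, .) (or q_d(0, w, .), etc.) over the boolean x assignments.
fn merge_over_w(table: &[Vec<Felt>], rho_w: &[Felt]) -> Vec<Felt> {
    let num_w = rho_w.len();
    assert_eq!(table.len(), 1 << num_w);
    let num_x = table[0].len();
    let mut out = vec![0.0; num_x];
    for (w_idx, row) in table.iter().enumerate() {
        // destructure the row index into the boolean point w = (w_1, ..., w_{nu-1})
        let w: Vec<Felt> = (0..num_w)
            .map(|b| ((w_idx >> (num_w - 1 - b)) & 1) as Felt)
            .collect();
        let weight = eq(&w, rho_w);
        for (o, v) in out.iter_mut().zip(row) {
            *o += weight * v;
        }
    }
    out
}
```

The simplification proposed in the next section amounts to calling this same merge with $\hat{w}$ in place of $\rho_w$, skipping the first sum-check phase entirely.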

Simplification of input layer

I believe we can forgo the first sum-check over $w$, and define our input layer sum-check as

$$ (p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x}) = \sum_{x \in {0,1}^n} eq(x, \hat{x}) [\tilde{p}_d(0, \hat{w}, x)\tilde{q}_d(1, \hat{w}, x) + \tilde{p}_d(1, \hat{w}, x)\tilde{q}_d(0, \hat{w}, x) + \lambda \tilde{q}_d(0, \hat{w}, x) \tilde{q}_d(1, \hat{w}, x)], $$

where we reuse the same abuse of notation for $\tilde{p}_d$ and $\tilde{q}_d$ from the previous section. In practice, this reduces to only running the FinalLayerProof.after_merge_proof part of the input layer, and using $\hat{w}$ as the merge_randomness argument to GkrCompositionMerge::new.

Formally, this reuses the idea from Thaler's note, specifically equations 3 and 4 and the subsequent proof. Basically, to show that the right-hand side is an equivalent definition of $(p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x})$, we only need to show that it is a proper MLE of the input layer data; that is, that it

  1. properly interpolates the input layer data, and
  2. is multilinear.

This, along with the fact that the MLE of a function is unique, completes the proof. I believe the same proof (after equation 4) from Thaler's note applies here, so I will leave it out.
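
For reference, the uniqueness fact being relied on is the standard one: a multilinear polynomial is determined by its values on the boolean hypercube, i.e.

$$ P, Q \text{ multilinear over } \mathbb{F}^k \text{ and } P(b) = Q(b) \text{ for all } b \in {0,1}^k \implies P = Q, $$

so establishing points 1 and 2 above would identify the right-hand side with $(p_{d-1} + \lambda q_{d-1})(\hat{w}, \hat{x})$.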

Al-Kindi-0 (Collaborator) commented

The argument needs linearity in order for it to go through, but the expression proposed in the simplification is not linear in $\hat{w}$ as far as I can see: there is a product between $\tilde{p}_d$ and $\tilde{q}_d$, and both are functions of $\hat{w}$.
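
Concretely, expanding the first product in the proposed right-hand side makes the obstruction visible:

$$ \tilde{p}_d(0, \hat{w}, x)\, \tilde{q}_d(1, \hat{w}, x) = \sum_{w, w' \in {0,1}^{\nu - 1}} eq(w, \hat{w})\, eq(w', \hat{w})\, p_d(0, w, x)\, q_d(1, w', x), $$

and for fixed boolean $w, w'$ each factor $eq(\cdot, \hat{w})$ is linear in every coordinate of $\hat{w}$, so their product is in general quadratic in each coordinate. The right-hand side is therefore not multilinear in $\hat{w}$, and the uniqueness argument does not apply.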

plafer (Contributor, Author) commented Aug 8, 2024

Ah yes, good catch, Thaler's simplification indeed doesn't apply in this case.

plafer closed this as not planned on Aug 8, 2024
Al-Kindi-0 (Collaborator) commented

(I am re-opening based on an offline discussion.)

Indeed, Thaler's idea does not seem to apply here but nevertheless we can simplify the current treatment of the input layer by making the following observation. The equation linking the input layer to its successor can be expressed as follows:

$$ p_{d-1}\left(v_2, \cdots, v_m, x_1, \cdots, x_n\right) = \sum_{y\in\mathbb{B}_n} G ( y ) $$

and

$$ q_{d-1}\left(v_2, \cdots, v_m, x_1, \cdots, x_n\right) = \sum_{y\in\mathbb{B}_n} H\left(y \right) $$

where $\mathbb{B}_k$ denotes the boolean hypercube of dimension $k$,

$$ G := \left( \left(y_1, \cdots, y_n\right) \longrightarrow EQ\left(\left(x_1, \cdots, x_n\right), \left(y_1, \cdots, y_n\right)\right) \left( \sum_{\left(w_2, \cdots, w_m\right) \in \mathbb{B}_{m-1}} EQ\left(\left(v_2, \cdots, v_m\right), \left(w_2, \cdots, w_m\right)\right) \cdot \left( p_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) + p_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right)\right) \right)\right) $$

and

$$ H := \left( \left(y_1, \cdots, y_n\right) \longrightarrow EQ\left(\left(x_1, \cdots, x_n\right), \left(y_1, \cdots, y_n\right)\right) \left( \sum_{\left(w_2, \cdots, w_m\right) \in \mathbb{B}_{m-1}} EQ\left(\left(v_2, \cdots, v_m\right), \left(w_2, \cdots, w_m\right)\right) \cdot q_d\left(1, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \cdot q_d\left(0, w_2, \cdots, w_m, y_1, \cdots, y_n\right) \right)\right). $$

From a performance standpoint, the tree depth stays the same, but we avoid running $m - 1$ rounds of the degree-3 sum-check and simplify the treatment of the input layer, which is a big plus.
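
A minimal sketch of this observation (hypothetical names, `f64` in place of field elements; not the actual implementation): the whole $w$-sum is folded into the evaluation of $G$ and $H$ at each boolean point $y$, so the remaining sum-check only runs over the $y$ variables.

```rust
/// Illustration only: `Felt` stands in for a field element.
type Felt = f64;

/// EQ(a, b) = prod_i (a_i * b_i + (1 - a_i) * (1 - b_i)).
fn eq(a: &[Felt], b: &[Felt]) -> Felt {
    a.iter()
        .zip(b)
        .map(|(ai, bi)| ai * bi + (1.0 - ai) * (1.0 - bi))
        .product()
}

/// Evaluates G at one boolean point y. `p` and `q` are stand-ins for however the
/// input-layer values p_d(b, w, y) and q_d(b, w, y) are provided (b in 0,1 and
/// w, y boolean); `x_point` is (x_1, ..., x_n) and `v_point` is (v_2, ..., v_m).
fn g_at(
    y: &[Felt],
    x_point: &[Felt],
    v_point: &[Felt],
    p: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
    q: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
) -> Felt {
    let k = v_point.len(); // number of w variables, i.e. m - 1
    let mut inner = 0.0;
    for idx in 0..1usize << k {
        // destructure the index into the boolean point w = (w_2, ..., w_m)
        let w: Vec<Felt> = (0..k)
            .map(|b| ((idx >> (k - 1 - b)) & 1) as Felt)
            .collect();
        inner += eq(&w, v_point)
            * (p(1.0, &w, y) * q(0.0, &w, y) + p(0.0, &w, y) * q(1.0, &w, y));
    }
    eq(x_point, y) * inner
}

/// H is identical except that the inner term is q_d(1, w, y) * q_d(0, w, y).
fn h_at(
    y: &[Felt],
    x_point: &[Felt],
    v_point: &[Felt],
    q: &dyn Fn(Felt, &[Felt], &[Felt]) -> Felt,
) -> Felt {
    let k = v_point.len();
    let mut inner = 0.0;
    for idx in 0..1usize << k {
        let w: Vec<Felt> = (0..k)
            .map(|b| ((idx >> (k - 1 - b)) & 1) as Felt)
            .collect();
        inner += eq(&w, v_point) * q(1.0, &w, y) * q(0.0, &w, y);
    }
    eq(x_point, y) * inner
}
```

Summing `g_at` (respectively `h_at`) over all boolean $y$ reproduces the two displayed sums above.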

Al-Kindi-0 reopened this on Aug 15, 2024
plafer (Contributor, Author) commented Aug 21, 2024

Implemented in facebook/winterfell#295

plafer closed this as completed on Aug 21, 2024