In the previous article, we took a deep dive into low-degree extension with univariate Lagrange interpolation. We also touched on how low-degree extension compares to multilinear extension (MLE) and pointed out that MLE can produce polynomials whose total degree is logarithmic in the size of the relevant domain.
In this article, we will unpack how MLE works and how it can be implemented, while also touching on the significance of MLE in the context of interactive proofs.
Multivariate Functions
We are now concerned with multivariate functions defined over the v-dimensional domain {0,1}ᵛ. This domain contains 2ᵛ unique Boolean tuples.
For example, f(x1, x2) is a bivariate function where v=2 and the domain is {0,1}²:
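{0,1}² = { (0,0), (0,1), (1,0), (1,1) }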
And f(x1, x2, …, xv) is a v-variate function whose domain contains 2ᵛ unique tuples.
These functions map Boolean tuples to elements in a finite field:
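f : {0,1}ᵛ → F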
Importantly, every such function has an extension polynomial that is multilinear.
Multilinear Polynomials
In a multilinear polynomial, every variable has degree at most 1. The total degree of a polynomial is the degree of its highest-degree term, where a term's degree is the sum of the degrees of its variables.
So, for example, the following MLE of f(x1, x2) over {0,1}² has a total degree of 2 due to the final term being composed of both x1 and x2:
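g(x1, x2) = c0 + c1·x1 + c2·x2 + c3·x1·x2

Here c0 through c3 stand for arbitrary field-element coefficients (the exact values are immaterial for this illustration); the final term, c3·x1·x2, has degree 1 in each of x1 and x2, giving the polynomial a total degree of 2 whenever c3 is non-zero.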
Since any term of a multilinear polynomial in v variables can involve at most v variables, each with degree at most 1, its total degree cannot be greater than v. This means that the total degree of an MLE of a multivariate function is logarithmic in the domain size, 2ᵛ.
Remember that univariate low-degree extension required polynomials of degree n-1, which is linear in the domain size, n. As such, MLEs can be leveraged for SNARK design in different ways than their univariate counterparts.
The Boolean Hypercube
Another way of describing our multivariate function is in relation to a Boolean hypercube. Specifically, our function maps the vertices of a v-dimensional Boolean hypercube to finite field elements.
A Boolean hypercube is a geometric representation of the set of all Boolean tuples of a given length. The hypercube represents the domain {0,1}ᵛ, which consists of all possible Boolean tuples of length v. We can visualize it as a graph in which each vertex corresponds to a Boolean tuple and is connected by an edge to every tuple that differs from it in a single bit. The distance between two vertices is the number of bit positions at which the two tuples differ (the Hamming distance).
The 3-dimensional Boolean hypercube can be visualized as follows:
   000--------001
   /|         /|
  / |        / |      3-Dimensional Boolean Hypercube:
010--------011 |      {000,001,010,011,100,101,110,111}
 |  |       |  |
 | 100------|-101
 | /        | /
 |/         |/
110--------111
Notice that in the domain of {0,1}³, each vertex in the Boolean hypercube has 3 edges and the maximum Hamming distance is also 3.
When we compute the MLE of some multivariate function, we compute it over the Boolean hypercube that represents the input domain.
Extension of the Multivariate Function
As mentioned above, every multivariate function f(x) that maps the v-dimensional Boolean hypercube to a finite field has a unique corresponding multilinear extension polynomial, which we will call g(x). The extension polynomial evaluates identically to f(x) for every Boolean tuple from the hypercube. However, g(x)'s domain is Fᵛ rather than {0,1}ᵛ because it has an evaluation for every tuple in {0,1,2,…,|F|-1}ᵛ. As such, g(x) is a distance-amplifying encoding of f(x).
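In other words: g(w) = f(w) for every w in {0,1}ᵛ, while g itself is defined at every point of Fᵛ.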
As always, distance-amplification allows us to exploit the probabilities involved in the Schwartz-Zippel lemma for the purposes of interactive proofs. With respect to multivariate polynomials over a finite field F, the Schwartz-Zippel lemma states that any two distinct polynomials of total degree at most d can agree on at most a d/|F| fraction of inputs. So if two multivariate functions over the domain {0,1}ᵛ disagree at even a single input, then any extensions of those functions of total degree at most d must differ virtually everywhere, given that d is much smaller than |F|.
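Stated probabilistically: if g1 and g2 are distinct v-variate polynomials of total degree at most d, then for a point r chosen uniformly at random from Fᵛ,

Pr[ g1(r) = g2(r) ] ≤ d / |F|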
Lagrange Interpolation of Multilinear Polynomials
Just as we did for univariate low-degree extension in the previous article, we can use Lagrange interpolation to compute the MLE of a multivariate function.
Recall that basis polynomials are designed to evaluate to 1 for one particular input and 0 for all other possible inputs. Just as with the univariate approach, the algorithm involves a linear combination of basis polynomials. But this time, we create a basis polynomial for every Boolean tuple in the domain of the multivariate function.
The MLE of f is the sum of all multilinear Lagrange basis polynomials with interpolating set {0,1}ᵛ, each weighted by the evaluation of f at the respective element of the interpolating set:
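Writing χw for the basis polynomial associated with the tuple w, the MLE g of f can be expressed as:

g(x1, …, xv) = Σ over w in {0,1}ᵛ of f(w) · χw(x1, …, xv)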
For every Boolean tuple w in {0,1}ᵛ, we multiply its evaluation under f by the multilinear Lagrange basis polynomial interpolated from the Boolean variables in w.
To evaluate a basis polynomial, we pair its Boolean tuple w with the evaluation point x. Because we intend to evaluate the MLE of f at x, the tuple x ranges over {0,1,…,|F|-1}ᵛ. For each pair of i'th variables, we evaluate an expression that yields 1 when the i'th variables of w and x are equal and 0 when they are unequal, provided x lies inside the domain {0,1}ᵛ. This expression is repeated for every variable in w and x and the results are multiplied together:
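Concretely, the basis polynomial for a tuple w can be written as the product

χw(x1, …, xv) = ∏ from i = 1 to v of ( xi·wi + (1 - xi)·(1 - wi) )

When xi is Boolean, the i'th factor equals 1 if xi = wi and 0 otherwise, so χw evaluates to 1 at x = w and to 0 at every other point of {0,1}ᵛ.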
As you can see, the above algorithm allows us to evaluate the MLE of f at any point in Fᵛ through the use of multilinear Lagrange interpolation. Now let's apply the algorithm to a simple example.
We define a bivariate function over the domain {0,1}²:
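f(0,0) = a,   f(0,1) = b,   f(1,0) = c,   f(1,1) = d

where a, b, c and d stand for the four (for now unspecified) field elements that f maps to.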
The MLE of the bivariate function is then:
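g(x1, x2) = a·(1 - x1)·(1 - x2) + b·(1 - x1)·x2 + c·x1·(1 - x2) + d·x1·x2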
Notice how each of the four terms on the right-hand side of the equation ensures that the MLE agrees with our predefined values for the bivariate function at the corresponding Boolean input:
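g(0, 0) = a·1·1 + b·1·0 + c·0·1 + d·0·0 = a
g(0, 1) = b,   g(1, 0) = c,   g(1, 1) = d

At each Boolean input, all but one of the basis polynomials vanish, leaving exactly the predefined evaluation.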
Now that we have observed the patterns at play, let's replace the variables a, b, c, and d with actual evaluations¹ of f and see what the corresponding MLE looks like over the finite field Fp, where p=5:
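f(0,0) = 1,   f(0,1) = 2,   f(1,0) = 3,   f(1,1) = 4   (so a = 1, b = 2, c = 3 and d = 4)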
Given the above evaluations, our MLE equation is:
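g(x1, x2) = 1·(1 - x1)·(1 - x2) + 2·(1 - x1)·x2 + 3·x1·(1 - x2) + 4·x1·x2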
Which we can simplify to:
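g(x1, x2) = 1 + 2·x1 + x2   (mod 5)

(the x1·x2 terms cancel, since their coefficients sum to 1 - 2 - 3 + 4 = 0)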
So our MLE over Fp, where p=5, looks like this:
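With rows indexed by x1 = 0…4 and columns by x2 = 0…4, the MLE evaluates to:

       x2=0  x2=1  x2=2  x2=3  x2=4
x1=0      1     2     3     4     0
x1=1      3     4     0     1     2
x1=2      0     1     2     3     4
x1=3      2     3     4     0     1
x1=4      4     0     1     2     3

The four original evaluations of f sit in the top-left 2×2 corner, while the remaining 21 entries are new evaluations introduced by the extension.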
With the above example, we can observe the distance-amplification of MLEs and understand how it was achieved. To further improve our understanding, we can look at how this algorithm can be implemented in Rust. A simple implementation of the algorithm for educational purposes is available here. Running the MLE package will execute the working example we covered above:
$ cargo run -p mle
Computed f(w) over the hypercube {0,1}^2:
f[0, 0] -> 1
f[0, 1] -> 2
f[1, 0] -> 3
f[1, 1] -> 4
Computed MLE(x) for every x in Fp:
1 2 3 4 0
3 4 0 1 2
0 1 2 3 4
2 3 4 0 1
4 0 1 2 3
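To make the procedure concrete in code, here is a minimal, self-contained sketch of the evaluation algorithm described above. It is not the linked package's exact code: the function names, the hard-coded modulus p = 5 and the lexicographic ordering of the evaluations are illustrative choices.

const P: u64 = 5; // modulus of the example field, Fp with p = 5 (illustrative choice)

// chi_w(x) = product over i of ( xi*wi + (1 - xi)*(1 - wi) ), reduced mod p.
// (1 - t) is computed as (1 + P - t) % P to stay within unsigned arithmetic.
fn lagrange_basis(w: &[u64], x: &[u64]) -> u64 {
    w.iter()
        .zip(x.iter())
        .map(|(&wi, &xi)| (xi * wi + ((1 + P - xi) % P) * ((1 + P - wi) % P)) % P)
        .fold(1, |acc, term| acc * term % P)
}

// MLE(x) = sum over all w in {0,1}^v of f(w) * chi_w(x), reduced mod p.
// evals[i] holds f(w), where w is the i-th Boolean tuple in lexicographic order.
fn evaluate_mle(evals: &[u64], x: &[u64]) -> u64 {
    let v = x.len();
    assert_eq!(evals.len(), 1usize << v);
    let mut acc = 0;
    for (i, &fw) in evals.iter().enumerate() {
        // Decode the index i into the Boolean tuple w = (w1, ..., wv).
        let w: Vec<u64> = (0..v).map(|j| ((i >> (v - 1 - j)) & 1) as u64).collect();
        acc = (acc + fw * lagrange_basis(&w, x)) % P;
    }
    acc
}

fn main() {
    // f over {0,1}^2: f(0,0) = 1, f(0,1) = 2, f(1,0) = 3, f(1,1) = 4.
    let evals: [u64; 4] = [1, 2, 3, 4];
    // Evaluate the MLE at every (x1, x2) in Fp x Fp and print the grid.
    for x1 in 0..P {
        let row: Vec<String> = (0..P)
            .map(|x2| evaluate_mle(&evals, &[x1, x2]).to_string())
            .collect();
        println!("{}", row.join(" "));
    }
}

Running this sketch reproduces the 5×5 grid of MLE evaluations shown in the output above.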
Multilinear Polynomials in SNARK Design
Today, the most common approaches to SNARK design tend to involve some combination of a polynomial interactive oracle proof (IOP), a polynomial commitment scheme, and the Fiat-Shamir transformation.² The polynomial commitment scheme transforms the IOP into succinct interactive arguments while the Fiat-Shamir transformation renders the arguments non-interactive and publicly verifiable.
Some polynomial IOP approaches leverage multivariate polynomials while others rely on univariate ones. Univariate polynomial commitment schemes can be transformed into multivariate ones at the expense of some computation.
The sum-check protocol is an interactive proof protocol that provides a means of designing efficient polynomial IOPs using multivariate polynomials. In the sum-check protocol, the prover convinces the verifier that it has correctly summed the evaluations of a multivariate polynomial over a Boolean hypercube. If you have read this article to completion, this should sound like a very familiar procedure to you.
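In symbols, the prover claims a value H for a sum of the form

H = Σ over w in {0,1}ᵛ of g(w)

where g is a multivariate polynomial, and the verifier checks this claim far more cheaply than by computing the sum itself.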
In the next article, we will explore the sum-check protocol for interactive proofs.
¹ The evaluations are arbitrary. We could choose any combination of elements from the respective finite field.