The Sum-Check Protocol

Articles based on Justin Thaler's Proofs, Arguments, and Zero-Knowledge

Oct 08, 2024

The sum-check protocol is an interactive proof which can be leveraged in the design of SNARKs. On its own, the protocol is information-theoretically secure¹: its soundness rests on no cryptographic assumptions, and it uses only field additions and multiplications. When applied to SNARK design, however, the sum-check protocol is combined with cryptographic primitives in order to yield an argument.²

The ultimate goal of the sum-check protocol in SNARK design is to reduce a claim made by a prover to a set of operations that can be executed efficiently by a corresponding verifier. In this article, we will learn the inner workings of the sum-check protocol while also touching on its practical applications.

Hypothetical Context

Before we drill into the mathematical definition of the sum-check protocol, let’s try to understand the protocol at a high level in the context of a hypothetical arithmetic circuit, a prover P and a verifier V.³

P wants to convince V that some circuit was executed correctly with a specific witness and that they know a multivariate polynomial f which represents this execution. The polynomial f was derived from an algebraic representation of the circuit which contains M constraints and N variables. V must somehow be convinced that f combines M constraint polynomials, each of which evaluates to 0 on its corresponding variables from the execution trace.

Because we want our interactive proof to be succinct,⁴ V’s runtime must be much smaller than the time P spends constructing the proof for f. This efficiency can be achieved through the sum-check protocol, which allows P’s claim about M constraints to be reduced to a single claim about a random linear combination of those constraints. The randomness in this linear combination is provided by V, and it allows the protocol to leverage the probabilities involved in the Schwartz-Zippel lemma, which we have covered extensively in previous articles.

If the number of constraints M is equal to 2ᵐ, then the sum-check protocol involves m rounds of interaction between P and V.⁵ V must also compute an evaluation of f at the end of the protocol, meaning that V’s runtime is O(m+z), where z is the cost to evaluate f at a single input in the relevant field. This is a drastic improvement over O(2ᵐ), which is P’s runtime when computing the polynomial f.

The Protocol

Now that we have some high-level context for the sum-check protocol in SNARKs, let’s define the protocol precisely as an interactive proof.

As above, the prover P has a v-variate polynomial f defined over a finite field Fp.

\(f(x_1,x_2,...,x_v)\)

P claims to know the sum of the evaluations of f over all Boolean inputs (the Boolean hypercube).⁶

\(S = \sum_{(x_1,...,x_v) \in\{0,1\}^{v}}f(x_1,...,x_v)\)

P sends this sum S to V.
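As a rough illustration of what P computes here, the sketch below sums a small polynomial over the Boolean hypercube. The prime modulus `P`, the function names, and the choice of polynomial (borrowed from the worked example later in this article) are assumptions made for illustration only.

```rust
/// Hypothetical prime modulus standing in for the field F_p (an assumption).
const P: u64 = 1_000_000_007;

/// Example polynomial f(a, b, c) = a + 2b^2 + 3ac^3, evaluated modulo P.
fn f(x: &[u64]) -> u64 {
    let (a, b, c) = (x[0] % P, x[1] % P, x[2] % P);
    let b2 = b * b % P;
    let c3 = c * c % P * c % P;
    (a + 2 * b2 % P + 3 * a % P * c3 % P) % P
}

/// Sum `poly` over all 2^v points of the Boolean hypercube {0,1}^v.
fn hypercube_sum(poly: fn(&[u64]) -> u64, v: usize) -> u64 {
    let mut total = 0;
    for mask in 0u64..(1 << v) {
        // The bits of `mask`, most significant first, give (x_1, ..., x_v).
        let point: Vec<u64> = (0..v).map(|j| (mask >> (v - 1 - j)) & 1).collect();
        total = (total + poly(&point)) % P;
    }
    total
}

fn main() {
    println!("S = {}", hypercube_sum(f, 3));
}
```

The loop visits 2ᵛ points, which is exactly the work the verifier wants to avoid repeating.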

The remainder of the protocol is designed to reach a stage where V can confirm the correctness of S, to a sufficiently high probability of success, through a single evaluation of f. To reach this final stage, however, V and P must execute v rounds of communication, each of which involves a single low-degree univariate polynomial derived from f.

The v rounds of the protocol proceed as follows.

Round 1

First, P sends the univariate polynomial g₁(X₁), which is claimed to equal the sum of f over all Boolean assignments to the remaining variables (x₂,…,xᵥ):

\(g_1(X_1) := \sum_{(x_2,...,x_v) \in\{0,1\}^{v-1}}f(X_1,x_2,...,x_v)\)

Let’s compare this equation to the one for S. The difference is that the first variable x₁ of our v-variate polynomial is no longer summed over; it is left as a free variable X₁. Evaluating g₁ at X₁=0 and X₁=1 and adding the results recovers the full sum over the hypercube, so V is able to verify the following:

\(S=g_1(0) + g_1(1)\)

At this point, V has checked that g₁ is consistent with the claimed sum S, at far lower cost than the 2ᵛ evaluations of f that computing S directly would require. However, V is not yet convinced that g₁ itself is correct, and therefore that the claimed sum is in fact the correct sum. The remainder of the protocol will allow V to be convinced of the correctness of S.

V must also check that g₁ is a univariate polynomial of degree less than or equal to the degree of x₁ in f.

V then selects a random element r₁ from Fp and sends it to P.
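Round 1 can be sketched as follows. This is a minimal illustration rather than a reference implementation: the modulus `P`, the helper names `g1_at` and `round1_check`, and the example polynomial are all assumptions. The prover evaluates g₁ by summing f over the remaining variables, and the verifier only adds two field elements.

```rust
/// Hypothetical prime modulus standing in for the field F_p (an assumption).
const P: u64 = 1_000_000_007;

/// Example polynomial f(a, b, c) = a + 2b^2 + 3ac^3, evaluated modulo P.
fn f(x: &[u64]) -> u64 {
    let (a, b, c) = (x[0] % P, x[1] % P, x[2] % P);
    let b2 = b * b % P;
    let c3 = c * c % P * c % P;
    (a + 2 * b2 % P + 3 * a % P * c3 % P) % P
}

/// Prover side: evaluate g_1 at `x1` by summing f over (x_2, ..., x_v) in {0,1}^(v-1).
fn g1_at(x1: u64, v: usize) -> u64 {
    let mut total = 0;
    for mask in 0u64..(1 << (v - 1)) {
        let mut point = vec![x1 % P];
        point.extend((0..v - 1).map(|j| (mask >> (v - 2 - j)) & 1));
        total = (total + f(&point)) % P;
    }
    total
}

/// Verifier side: the round-1 consistency check S = g_1(0) + g_1(1).
fn round1_check(claimed_sum: u64, g1_0: u64, g1_1: u64) -> bool {
    claimed_sum % P == (g1_0 + g1_1) % P
}

fn main() {
    let (g0, g1) = (g1_at(0, 3), g1_at(1, 3));
    println!("g_1(0) = {g0}, g_1(1) = {g1}, check passes: {}", round1_check(18, g0, g1));
}
```

Note that the check alone does not prove S is correct; it only ties the claim about S to a claim about g₁, which the later rounds go on to test.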

Remaining Rounds

The process from round 1 is repeated for all remaining variables (x₂,…,xᵥ).

In round i, P sends a univariate polynomial gᵢ(Xᵢ) claimed to equal:

\(g_i(X_i) := \sum_{(x_{i+1},...,x_v) \in\{0,1\}^{v-i}}f(r_1,...,r_{i-1},X_i, x_{i+1},...,x_v)\)

V checks that gᵢ is a univariate polynomial of degree less than or equal to the degree of xᵢ in f.

V checks that the evaluation of the previous univariate polynomial and corresponding random element is equal to the sum of evaluations of gᵢ over Boolean inputs for Xᵢ:

\(g_{i-1}(r_{i-1}) = g_i(0) + g_i(1)\)

V selects another random element rᵢ from Fp and sends it to P.
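The general round can be sketched in the same spirit. In the hedged example below, `round_poly_at` is a hypothetical helper that evaluates gᵢ given the challenges r₁,…,rᵢ₋₁ chosen so far, and `round_check` is the verifier's consistency test; the modulus and the example polynomial are again assumptions.

```rust
/// Hypothetical prime modulus standing in for the field F_p (an assumption).
const P: u64 = 1_000_000_007;

/// Example polynomial f(a, b, c) = a + 2b^2 + 3ac^3, evaluated modulo P.
fn f(x: &[u64]) -> u64 {
    let (a, b, c) = (x[0] % P, x[1] % P, x[2] % P);
    let b2 = b * b % P;
    let c3 = c * c % P * c % P;
    (a + 2 * b2 % P + 3 * a % P * c3 % P) % P
}

/// Prover side: evaluate g_i at `x`, where `prefix` holds r_1, ..., r_{i-1}
/// and the remaining variables x_{i+1}, ..., x_v are summed over {0,1}.
fn round_poly_at(prefix: &[u64], x: u64, v: usize) -> u64 {
    let free = v - prefix.len() - 1; // number of variables still summed over
    let mut total = 0;
    for mask in 0u64..(1 << free) {
        let mut point = prefix.to_vec();
        point.push(x % P);
        point.extend((0..free).map(|j| (mask >> (free - 1 - j)) & 1));
        total = (total + f(&point)) % P;
    }
    total
}

/// Verifier side: check g_{i-1}(r_{i-1}) = g_i(0) + g_i(1).
fn round_check(prev_eval: u64, gi_0: u64, gi_1: u64) -> bool {
    prev_eval % P == (gi_0 + gi_1) % P
}

fn main() {
    // Round 2 with r_1 = 3: compare g_1(3) against g_2(0) + g_2(1).
    let prev = round_poly_at(&[], 3, 3);
    let (g0, g1) = (round_poly_at(&[3], 0, 3), round_poly_at(&[3], 1, 3));
    println!("g_1(3) = {prev}, g_2(0) + g_2(1) = {}", (g0 + g1) % P);
    println!("check passes: {}", round_check(prev, g0, g1));
}
```

Each round halves the number of points the prover sums over, since one more variable is fixed to a challenge.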

Final Round

During the final round, V receives gᵥ, which is checked to be a univariate polynomial of sufficiently low degree and is claimed to equal:

\(g_v(X_v)=f(r_1,...,r_{v-1},X_v)\)

V then selects a final random element rᵥ and confirms the following equality:⁷

\(g_v(r_v) = f(r_1,...,r_v)\)

If this check succeeds, then V is convinced as to the correctness of P’s original statement that the sum of the evaluations of f over all Boolean inputs is equal to S.
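Putting the rounds together, the whole interaction can be sketched as a single loop. This is an illustrative sketch under several assumptions: an honest prover, a single degree bound `D` for every variable (rather than a per-variable bound), messages sent as D+1 evaluations that the verifier interpolates, and challenges passed in rather than sampled. All names and the modulus are hypothetical.

```rust
const P: u64 = 1_000_000_007; // hypothetical prime modulus (an assumption)
const D: usize = 3; // assumed degree bound for every variable of f

/// Example polynomial f(a, b, c) = a + 2b^2 + 3ac^3, evaluated modulo P.
fn f(x: &[u64]) -> u64 {
    let (a, b, c) = (x[0] % P, x[1] % P, x[2] % P);
    (a + 2 * (b * b % P) % P + 3 * a % P * (c * c % P * c % P) % P) % P
}

/// Modular exponentiation, used for inverses via Fermat's little theorem.
fn pow_mod(mut b: u64, mut e: u64) -> u64 {
    let mut acc = 1;
    b %= P;
    while e > 0 {
        if e & 1 == 1 {
            acc = acc * b % P;
        }
        b = b * b % P;
        e >>= 1;
    }
    acc
}

/// Lagrange-interpolate the polynomial taking value evals[k] at x = k, then evaluate at r.
fn interpolate_at(evals: &[u64], r: u64) -> u64 {
    let (n, r) = (evals.len() as u64, r % P);
    let mut total = 0;
    for (k, &y) in evals.iter().enumerate() {
        let k = k as u64;
        let (mut num, mut den) = (1, 1);
        for j in (0..n).filter(|&j| j != k) {
            num = num * ((r + P - j) % P) % P;
            den = den * ((k + P - j) % P) % P;
        }
        total = (total + y * num % P * pow_mod(den, P - 2)) % P;
    }
    total
}

/// Honest prover: evaluations of g_i at 0..=D, given the challenges so far.
fn prover_message(prefix: &[u64], v: usize) -> Vec<u64> {
    let free = v - prefix.len() - 1;
    (0..=D as u64)
        .map(|x| {
            (0u64..(1 << free)).fold(0, |acc, mask| {
                let mut point = prefix.to_vec();
                point.push(x);
                point.extend((0..free).map(|j| (mask >> (free - 1 - j)) & 1));
                (acc + f(&point)) % P
            })
        })
        .collect()
}

/// Verifier loop: one consistency check per round, then a single evaluation of f.
fn run_sumcheck(claimed_sum: u64, challenges: &[u64]) -> bool {
    let v = challenges.len();
    let mut expected = claimed_sum % P;
    let mut prefix = Vec::new();
    for &r in challenges {
        let evals = prover_message(&prefix, v);
        if (evals[0] + evals[1]) % P != expected {
            return false; // g_i(0) + g_i(1) does not match the running claim
        }
        expected = interpolate_at(&evals, r); // becomes the claim g_i(r_i)
        prefix.push(r % P);
    }
    expected == f(&prefix) // final check: g_v(r_v) = f(r_1, ..., r_v)
}

fn main() {
    println!("accepts S = 18: {}", run_sumcheck(18, &[5, 7, 11]));
    println!("accepts S = 19: {}", run_sumcheck(19, &[5, 7, 11]));
}
```

Sending D+1 evaluations makes the degree check implicit, because any D+1 values determine a polynomial of degree at most D. A real verifier would sample each challenge uniformly at random from the field instead of accepting it as an argument.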

Minimal Worked Example

Let’s walk through a worked example of the sum-check algorithm to improve our understanding of the protocol.

We define the multivariate polynomial f as follows:⁸

\(f(a, b, c) = a + 2b^2 + 3ac^3\)

The sum of the evaluations of f over the Boolean hypercube is S=18:

\(\begin{align} f(0,0,0)=0\\ f(0,0,1)=0 \\ f(0,1,0)=2 \\ f(0,1,1)=2 \\ f(1,0,0)=1 \\ f(1,0,1)=4 \\ f(1,1,0)=3 \\ f(1,1,1)= 6 \\ \end{align}\)

The first univariate polynomial g₁ provided by the prover is as follows:

\( \begin{align} g_1(a) = f(a,0,0) + f(a,0,1) + f(a,1,0) + f(a,1,1)\\ = (a) + (a+3a)+ (a+2) + (a+2+3a) \\ = 4+10a \end{align}\)

The verifier checks that g₁ is a univariate polynomial of degree at most 1 and that the sum of its evaluations over Boolean inputs is equal to S=18.

\(S=g_1(0)+g_1(1)= 4 + 10 + 4 = 18\)

The verifier selects a random r₁=3. The prover responds with a second univariate polynomial g₂:

\(\begin{align} g_2(b)=f(3,b,0)+f(3,b,1) \\ = (3+2b^2) + (3+2b^2+9) \\ = 15 + 4b^2 \end{align}\)

The verifier checks that g₂ is a univariate polynomial of degree at most 2 and that the sum of its evaluations over binary inputs is equal to g₁(r₁):

\(\begin{align} g_1(r_1)=4+30=34 \\ g_2(0)+g_2(1)= 15+15+4 =34 \\ \end{align} \)

The verifier selects a random r₂=2. The prover responds with a third univariate polynomial g₃:

\(\begin{align} g_3(c)=f(3,2,c) \\ = (3+8+9c^3) \\ = 11+9c^3 \end{align}\)

The verifier checks that g₃ is a univariate polynomial of degree at most 3 and that the sum of its evaluations over binary inputs is equal to g₂(r₂):

\(\begin{align} g_2(r_2)=15+16=31 \\ g_3(0)+g_3(1)= 11+11+9=31 \\ \end{align} \)

The verifier selects a random r₃=1 and confirms that f(r₁,r₂,r₃)=g₃(r₃):

\(\begin{align} f(3,2,1)=3+2\times2^2+3\times3\times1^3 = 20\\ g_3(1)= 11+9\times1^3 =20\\ \end{align} \)

After this final check, the verifier is convinced that the original claim by the prover that S=18 is in fact correct.
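The verifier's side of this worked example can be replayed with a short sketch. The code below is not taken from the crate mentioned in the next section; it is a hypothetical illustration in which the prover's messages are given as coefficient vectors (lowest degree first), the modulus `P` is an assumption, and the challenges are the fixed values 3, 2, and 1 used above.

```rust
/// Hypothetical prime modulus standing in for the field F_p (an assumption).
const P: u64 = 1_000_000_007;

/// Example polynomial f(a, b, c) = a + 2b^2 + 3ac^3, evaluated modulo P.
fn f(x: &[u64]) -> u64 {
    let (a, b, c) = (x[0] % P, x[1] % P, x[2] % P);
    let b2 = b * b % P;
    let c3 = c * c % P * c % P;
    (a + 2 * b2 % P + 3 * a % P * c3 % P) % P
}

/// Evaluate a polynomial given by its coefficients (lowest degree first) using Horner's rule.
fn eval(coeffs: &[u64], x: u64) -> u64 {
    coeffs.iter().rev().fold(0, |acc, &c| (acc * (x % P) + c) % P)
}

/// Verifier: check every round message against its degree bound and the running claim,
/// then compare the last claim with a single evaluation of f.
fn verify(claimed_sum: u64, msgs: &[Vec<u64>], deg_bounds: &[usize], rs: &[u64]) -> bool {
    if msgs.len() != rs.len() || deg_bounds.len() != rs.len() {
        return false;
    }
    let mut expected = claimed_sum % P;
    for ((g, &d), &r) in msgs.iter().zip(deg_bounds).zip(rs) {
        // Degree check: a polynomial of degree at most d has at most d + 1 coefficients.
        if g.len() > d + 1 {
            return false;
        }
        // Consistency check: g_i(0) + g_i(1) must equal the running claim.
        if (eval(g, 0) + eval(g, 1)) % P != expected {
            return false;
        }
        expected = eval(g, r);
    }
    expected == f(rs)
}

fn main() {
    // g_1 = 4 + 10x, g_2 = 15 + 4x^2, g_3 = 11 + 9x^3
    let msgs = vec![vec![4, 10], vec![15, 0, 4], vec![11, 0, 0, 9]];
    let ok = verify(18, &msgs, &[1, 2, 3], &[3, 2, 1]);
    println!("verifier accepts: {ok}");
}
```

Changing any coefficient in `msgs`, or the claimed sum, causes one of the round checks or the final evaluation to fail for these challenges.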

Example in Rust

The above worked example can be executed from the Rust crate found here:

$ cargo run -p sumcheck

Defined:        f   = a + 2b^2 + 3ac^3
Round 1:        g_1 = 4 + 10x   r_1 = 3 
Round 2:        g_2 = 15 + 4x^2 r_2 = 2 
Round 3:        g_3 = 11 + 9x^3 r_3 = 1 
Finalized:      g_3(r_3) = 20

The crate is structured to be an approachable learning resource. For example, instead of random values, the code uses the r₁, r₂, and r₃ from the worked example above. The protocol round functions are also organized to reflect the round-by-round breakdown covered in this article. So if you are interested in getting a better understanding of the protocol, a look into the code and its explanatory comments should be very helpful.

Sum-Check in SNARK Design

Justin Thaler refers to the sum-check protocol as a “hammer in the design of efficient interactive proofs”.⁹ In Chapter 4 of the book, Thaler shows how the sum-check protocol can be used to implement application-specific interactive proofs. This includes interactive proofs for triangle counting, matrix multiplication, and Boolean circuit satisfiability.

The sum-check protocol is leveraged by many modern SNARK systems, both as a fundamental component of the GKR protocol and as a stand-alone building block in its own right.

1. Information-theoretically secure systems are secure against adversaries with unbounded compute and time. This is in contrast to systems whose security relies on the computational cost of cryptanalysis.

2. Cryptographic primitives such as the hash functions used in the Fiat-Shamir transformation and the commitment schemes used for witness commitments.

3. Note that the sum-check protocol is just one component of the proof system in this hypothetical scenario. But we can ignore the other components for now.

4. Remember that succinctness in SNARKs requires verifier time to be sublinear, and ideally polylogarithmic, in the circuit size.

5. A circuit can easily have a billion constraints.

6. In the previous article about multilinear extensions we introduced the concept of the Boolean hypercube.

7. In the formal definition of the sum-check protocol, the verifier requires an oracle query to f for this step. In practice, the verifier would either be able to efficiently evaluate f(r₁,…,rᵥ) or would have to ask the prover to perform the evaluation and provide a proof for the claimed evaluation.

8. Note that we have replaced the X₁, X₂, and X₃ notation with a, b, and c here simply for readability.

9. Proofs, Arguments, and Zero-Knowledge, Chapter 4, Page 51.


sergerad.xyz