Understanding PLONK (Part 1): Plonkish Arithmetization

Arithmetization refers to transforming computations into mathematical objects and then performing zero-knowledge proofs. Plonkish arithmetization is a specific arithmetization method in the Plonk proof system. Prior to the introduction of Plonkish, the mainstream circuit representation was R1CS, widely used in systems such as Pinocchio, Groth16, and Bulletproofs. In 2019, the Plonk scheme proposed a seemingly retrograde circuit encoding method. However, due to its extreme application of polynomial encoding, Plonk is no longer limited to “addition gates” and “multiplication gates” in arithmetic circuits, but can support more flexible “custom gates” and “lookup gates”.

Let’s first review the circuit encoding of R1CS, which is the most commonly used arithmetic scheme. Then we will compare it with the introduction of Plonkish encoding.

Arithmetic circuits and arithmeticization of R1CS

An arithmetic circuit consists of several multiplication gates and addition gates. Each gate has “two input” pins and one “output” pin, and any output pin can be connected to multiple input pins of gates.

First, let’s look at a very simple arithmetic circuit:

This circuit represents such a calculation:

$(x_{1} + x_{2}) \cdot (2 \cdot x_{3}) = o u t$

电路中有4个变量，其中三个变量为输入变量 $(x_{1}, x_{2}, x_{3})$ ，一个输出变量 $o u t$ ，其中还有一个输入为常数，其值为 $2$ 。

A circuit has two states: “blank state” and “operational state”. When the input variables do not have specific values, the circuit is in the “blank state”, and we can only describe the relationship between the circuit wires, or the structural topology of the circuit.

The next step is to encode the “blank state” of the circuit, which means encoding the positions of each gate and their interconnecting wire relationships.

R1CS is centered around multiplication gates in the graph, using three “selector” matrices to connect the “left input,” “right input,” and “output” of the multiplication gates to respective variables.

Let’s start by looking at the left input of the multiplication gate at the top of the diagram. It can be described using the table below:

$10 x_{1} 1 x_{2} 1 x_{3} 0 o u t 0$

这个表格只有一行，因此我们可以用一个向量 $U = (0, 1, 1, 0, 0)$ 来代替，表示乘法门的左输入连接了两个变量， $x_{1}$ 和 $x_{2}$ 。记住，所有的加法门都会被展开成多个变量的相加（或线性组合）。

再看看其右输入，连接了一个变量 $x_{3}$ 和一个常数值，等价于连接了 $x_{3}$ 的两倍，那么右输入的选择子矩阵可以记为

$10 x_{1} 0 x_{2} 0 x_{3} 2 o u t 0$

这里同样可以用一个行向量 $V = (0, 0, 0, 2, 0)$ 来表示，其中的 $2$ 即为上图中电路的常数引线。

最后乘法门的输出按照上面的方法可以描述为 $W = (0, 0, 0, 0, 1)$ ，即输出变量为 $o u t$ ：

$10 x_{1} 0 x_{2} 0 x_{3} 0 o u t 1$

有了三个向量 $(U, V, W)$ ，我们可以通过一个「内积」等式来约束电路的运算：

$(U \cdot (1, x_{1}, x_{2}, x_{3}, o u t)) \cdot (V \cdot (1, x_{1}, x_{2}, x_{3}, o u t)) = (W \cdot (1, x_{1}, x_{2}, x_{3}, o u t))$

After simplifying this equation, we can obtain:

$(x_{1} + x 2) \cdot (2 \cdot x_{3}) = o u t$

如果我们把这几个变量换成赋值向量 $(1, x_{1}, x_{2}, x_{3}, o u t) = (1, 3, 4, 5, 70)$ ，那么电路的运算可以通过「内积」等式来验证：

$(U \cdot (1, 3, 4, 5, 70)) \cdot (U \cdot (1, 3, 4, 5, 70)) = W \cdot (1, 3, 4, 5, 70)$

而一个错误的赋值向量，比如 $(1, 3, 4, 0, 70)$ ，则不满足「内积等式」：

$(U \cdot (1, 3, 4, 0, 70)) \cdot (U \cdot (1, 3, 4, 0, 70)) \neq = W \cdot (1, 3, 4, 0, 70)$

左边运算结果为 $0$ ，右边运算结果为 $70$ 。当然，我们可以验证 $(1, 3, 4, 0, 0)$ 也是一组合法（满足电路约束）的赋值。

Not every circuit has an assignment vector. Circuits that have a valid assignment vector are called satisfiable circuits. Determining whether a circuit is satisfiable is an NP-Complete problem and also an NP-hard problem.

The two multiplication gates in the examples are not the same. The multiplication gate above has variables in both inputs, while the multiplication gate below has a variable on one side and a constant on the other side. For the latter type of “constant multiplication gate”, we also consider them as special “addition gates”. As shown in the diagram below, the multiplication gate in the bottom right of the left circuit is equivalent to the addition gate in the bottom right of the right circuit.

那么如果一个电路含有两个以上的乘法门，我们就不能用 $U, V, W$ 三个向量之间的内积关系来表示运算，而需要构造「三个矩阵」的运算关系。

Multiple multiplication gates

For example, in the circuit shown below, there are two multiplication gates, and both their left and right inputs involve variables.

This circuit represents such a calculation:

$(x_{1} + x 2) \cdot (x 3 \cdot x 4) = o u t$

We encode the circuit based on the multiplication gates. In the first step, the multiplication gates in the circuit are numbered sequentially (the order of numbering doesn’t matter as long as it is consistent). The two multiplication gates in the diagram are encoded as #1 and #2.

然后我们需要为每一个乘法门的中间值引线也给出变量名：比如四个输入变量被记为 $x_{1}, x_{2}, x_{3}, x_{4}$ ，其中 $x_{5}$ 为第二个乘法门的输出，同时作为第一个乘法门的右输入。而 $o u t$ 为第一个乘法门的输出。于是我们可以得到一个关于变量名的向量：

$(x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, o u t)$

The “blank state” of this circuit can be encoded using the following three matrices:

$U, V, W \in F^{n \times m}$

其中 $n$ 为乘法门的数量，而 $m$ 大致为引线的数量。每一个矩阵的第 $i$ 行「选择」了第 $i$ 个乘法门的输入输出变量。比如我们定义电路的左输入矩阵 $U$ ：

$x_{1} 10 x_{2} 10 x_{3} 01 x_{4} 00 x_{5} 00 o u t 00 i 12$

其中第一个乘法门的左输入为 $(x_{1} + x_{2})$ ，第二个乘法门的左输入为 $x_{3}$ 。右输入矩阵 $V$ 定义为：

$x_{1} 00 x_{2} 00 x_{3} 00 x_{4} 01 x_{5} 10 o u t 00 i 12$

其中1号门的右输入为 $x_{5}$ ，第二个乘法门的右输入为 $x_{4}$ 。最后定义输出矩阵 $W$ ：

$x_{1} 00 x_{2} 00 x_{3} 00 x_{4} 00 x_{5} 01 o u t 10 i 12$

我们把所有的引线赋值看作为一个向量： $a$ （这里用字母 $a$ ，取自 Assignments 首字母）

In the above example, the “assignment vector” is

$a = (x_{1}, x_{2}, x_{3}, x_{4}, x_{5}, o u t)$

So we can easily verify the following equation.

$(U \cdot a) \circ (V \cdot a) = (W \cdot a)$

其中符号 $\circ$ 为 Hadamard Product，表示「按位乘法」。展开上面的按位乘法等式，我们可以得到这个电路的运算过程：

$[x_{1} + x_{2} x_{3}] \circ [x_{5} x_{4}] = [o u t x_{5}]$

请注意，通常「赋值向量」中需要一个固定赋值为 $1$ 的变量，这是为了处理加法门中的常量输入。

Advantages and disadvantages

由于 R1CS 编码以乘法门为中心，于是电路中的加法门并不会增加 $U, V, W$ 矩阵的行数，因而对 Prover 的性能影响不大。R1CS 电路的编码清晰简单，利于在其上构造各种 SNARK 方案。

The encoding scheme in the 2019 Plonk paper requires encoding both addition and multiplication gates, which seems to increase the number of constraints and decrease the proving performance. However, the Plonk team subsequently introduced additional gates besides multiplication and addition, such as gates for range checks and XOR operations. Moreover, Plonk supports any gate with polynomial relations between its inputs and outputs, known as Custom Gates, as well as state transition gates for implementing RAM. With the introduction of lookup gates, the Plonk scheme gradually became the preferred choice for many applications, and its encoding method also gained a dedicated term: Plonkish.

Plonkish Arithmetic Door

回看下例子电路，我们把三个门全都编号， $1, 2, 3$ ，同时把加法门的输出值也标记为变量 $x_{6}$ 。

Clearly, the above circuit satisfies three constraints:

$x_{1} + x_{2} = x_{6}$
$x_{3} \cdot x_{4} = x_{5}$
$x_{6} \cdot x_{5} = o u t$

我们定义一个矩阵 $W \in F^{n \times 3}$ 来表示约束（ $n$ 为算术门的数量）：

$i 123 w_{a} x_{6} x_{1} x_{3} w_{b} x_{5} x_{2} x_{4} w_{c} o u t x_{6} x_{5}$

为了区分加法和乘法，我们再定一个向量 $Q \in F^{n \times 5}$ 来表示运算符

$i 123 q_{L} 010 q_{R} 010 q_{M} 101 q_{C} 000 q_{O} 111$

So we can represent the three constraints using the following equations:

$q_{L} \circ w_{a} + q_{R} \circ w_{b} + q_{M} \circ (w_{a} \cdot w_{b}) + q_{C} - q_{O} \circ w_{c} = 0$

If we substitute and expand the equation above, we can obtain the following constraint equations:

$010 \circ x_{6} x_{1} x_{5} + 010 \circ x_{5} x_{2} x_{4} + 101 \circ x_{6} \cdot x_{5} x_{1} \cdot x_{2} x_{3} \cdot x_{4} = 111 \circ o u t x_{6} x_{5}$

Simplified to:

$0 x_{1} 0 + 0 x_{2} 0 + x_{6} \cdot x_{5} 0 x_{3} \cdot x_{4} = o u t x_{6} x_{5}$

This is exactly the calculation constraint of three arithmetic operations.

总结下，Plonkish 需要一个矩阵 $Q$ 来描述电路空白态，而所有的赋值则写入了 $W$ 矩阵。对于 Prover 和 Verifier 的交换协议， $W$ 是 Prover 的 witness，属于秘密知识，对 Verifier 保密， $Q$ 矩阵代表了一个实现双方约定共识的电路描述。

不过仅仅有 $Q$ 矩阵是不足以精确描述上面的例子电路。

Copy Constraints

比较下面两个电路，它们的 $Q$ 矩阵完全相同，但它们却完全不同。

两个电路的区别在于 $x_{5}, x_{6}$ 是否被接入了 #1 号门。如果让 Prover 直接把电路赋值填入 $W$ 表格，一个「诚实的」Prover 会在 $w_{a, 1}$ 和 $w_{c, 2}$ 两个位置填上相同的值；而一个「恶意的」Prover 完全可以填上不同的值。如果恶意 Prover 在 $w_{b, 1}$ 和 $w_{c, 3}$ 也填入不同的值，那么实际上 Prover 证明的是上图右边的电路，而非是和 Verifier 共识过的电路（左边)。

$i 123 w_{a} x_{6} x_{1} x_{3} w_{b} \underline{x_{5}} x_{2} x_{4} w_{c} o u t x_{6} \underline{x_{5}}$

我们需要增加新的约束，强制要求右边电路图中 $x_{6} = x_{7}$ 和 $x_{5} = x_{8}$ 。这等价于我们要求 Prover 把同一个变量填入表格多个位置时，必须填入相等的值。

这就需要一类新的约束——「拷贝约束」，即 Copy Constraint。Plonk 采用「置换证明」保证 $W$ 表格中多个位置上的值满足拷贝关系。我们继续用上面这个电路图的案例来说明其基本思路：

设想我们把 $W$ 表格中的所有位置索引排成一个向量：

$σ_{0} = (w_{a, 1}, w_{a, 2}, w_{a, 3}, \underline{w_{b, 1}}, w_{b, 2}, w_{b, 3}, w_{c, 1}, w_{c, 2}, \underline{w_{c, 3}})$

然后把应该相等的两个位置互换，比如上图中要求 $w_{a, 1} = w_{c, 2}$ 和 $w_{b, 1} = w_{c, 3}$ 。于是我们得到了下面的位置向量：

$σ = (w_{c, 2}, w_{a, 2}, w_{a, 3}, \underline{w_{c, 3}}, w_{b, 2}, w_{b, 3}, w_{c, 1}, w_{a, 1}, \underline{w_{b, 1}})$

然后我们要求 Prover 证明： $W$ 表格按照上面的置换之后，仍然等于自身。置换前后的相等性可以保证 Prover 无法作弊。

Here’s another example, when constraining that three (or more) values in a vector must be equal, you only need to shift the values at these three (or more) positions cyclically (left or right), and then prove that the shifted vector is equal to the original vector. For example:

$A = (b_{1}, b_{2}, \underline{a_{1}}, b_{3}, \underline{a_{2}}, b_{4}, \underline{a_{3}})$

如果要证明 $a_{1} = a_{2} = a_{3}$ ，那么只需要证明：

$A^{'} = (b_{1}, b_{2}, \underline{a_{3}}, b_{3}, \underline{a_{1}}, b_{2}, \underline{a_{2}}) = ? A$

在经过置换的向量 $A^{'}$ 中， $a_{1}, a_{2}, a_{3}$ 依次右移交换，即 $a_{1}$ 放到了原来 $a_{2}$ 的位置，而 $a_{2}$ 放到了 $a_{3}$ 的位置， $a_{3}$ 则放到了 $a_{1}$ 的位置。

如果 $A^{'} = A$ ，那么 $A^{'}$ 和 $A$ 所有对应位置上的值都应该相等，可得： $a_{1} = a_{3}$ ， $a_{2} = a_{1}$ ， $a_{3} = a_{2}$ ，即 $a_{1} = a_{2} = a_{3}$ 。这个方法可以适用于任意数量的等价关系。（后续证明两个向量相等的方法请见下章）

那么如何描述电路赋值表格中的交换呢？我们只需要记录 $σ$ 向量即可，当然 $σ$ 向量也可以写成表格的形式：

$i 123 σ_{a} w_{c, 2} w_{a, 2} w_{a, 3} σ_{b} \underline{w_{c, 3}} w_{b, 2} w_{b, 3} σ_{c} w_{c, 1} w_{a, 1} \underline{w_{b, 1}}$

加上 $σ$ ，空白电路可以描述为 $(Q, σ)$ ，电路的赋值为 $W$

$Plonkish_{0} ≜ (Q, σ; W)$

Comparison again

R1CS 的 $(U, V, W)$ 表格的宽度与引线的数量有关，行数跟乘法门数量有关。这个构造相当于把算术电路看成是仅有乘法门构成，但每个门有多个输入引脚（最多为所有引线的数量）。而 Plonkish 则是同等对待加法门与乘法门，并且因为输入引脚只有两个，所以 $W$ 表格的宽度固定，仅有三列（如果要支持高级的计算门，表格可以扩展到更多列）。这一特性是 Plonk 可以利用 Permutation Argument 实现拷贝约束的前提。

I understand. Here is the translated text:

…, y así nuestras restricciones lineales son simplemente restricciones de cableado que se pueden reducir a una comprobación de permutación.

按照 Plonk 论文的统计，一般情况下，算术电路中加法门的数量是乘法门的两倍。如果这样看来， $W$ 表格的长度会三倍于 R1CS 的矩阵。但这个让步会带来更多的算术化灵活度。

Circuit verification protocol framework

With the description and assignment of the circuit blank structure, we can roughly describe the protocol framework of Plonk.

首先 Prover 和 Verifier 会对一个共同的电路进行共识， $(Q, σ)$ 。假设电路的公开输出为 $o u t = 99$ ，而 $(x_{1}, x_{2}, x_{3}, x_{4})$ 为秘密输入。

Prover 填写 $W$ 矩阵（Verifier 不可见）：

$i 1234 w_{a} x_{6} x_{1} x_{3} 0 w_{b} \underline{x_{5}} x_{2} x_{4} 0 w_{c} [o u t] x_{6} \underline{x_{5}} [o u t]$

其中增加的第四行是为了增加一个额外的算术约束： $o u t = 99$ ，把 $o u t$ 值显示地表示在 $Q$ 矩阵中。

相应的那么 Prover 和 Verifier 共识的 $Q$ 矩阵为

$i 1234 q_{L} 0100 q_{R} 0100 q_{M} 1010 q_{C} 00099 q_{O} 1111$

其中第四行约束，保证 $o u t = 99$ ，可以把 $(q_{L} = 0, q_{R} = 0, q_{M} = 0, q_{C} = 99, q_{O} = 1)$ 代入下面的算术约束，可得 $99 - w_{c} = 0$ ，即 $w_{c, 4} = 99$ 。

$q_{L} \circ w_{a} + q_{R} \circ w_{b} + q_{M} \circ (w_{a} \cdot w_{b}) + q_{C} - q_{O} \circ w_{c} = 0$

为了保证第一行的 $w_{c}$ 也必须为 $99$ ，这就需要在 $σ$ 矩阵中添加额外的一条拷贝约束：让 $o u t$ 变量的位置 $(w_{c, 1})$ 与第四行的输出 $w_{c, 4}$ 交换对调：

$i 1234 σ_{a} w_{c, 2} w_{a, 2} w_{a, 3} w_{a, 4} σ_{b} \underline{w_{c, 3}} w_{b, 2} w_{b, 3} w_{b, 4} σ_{c} [w_{c, 4}] w_{a, 1} \underline{w_{b, 1}} [w_{c, 1}]$

如果 Prover 是诚实的，那么对于 $i \in (1, 2, 3, 4)$ ，下面的算术约束等式成立：

$q_{L, i} \circ w_{a, i} + q_{R, i} \circ w_{b, i} + q_{M, i} \circ (w_{a, i} \cdot w_{b, i}) + q_{C, i} - q_{O, i} \circ w_{c, i} = 0$

The general idea of the verification protocol is as follows:

协议开始：Prover 如实填写 $W$ 表格，然后把 $W$ 表格的每一列进行编码，并进行多项式编码，并把编码后的结果发送给 Verifier

Agreement verification phase: The Verifier and Prover interact further to verify whether the following equation holds true:

$q_{L} (X) \cdot w_{a} (X) + q_{R} (X) \cdot w_{b} (X) + q_{M} (X) \cdot (w_{a} (X) \cdot w_{b} (X)) + q_{C} (X) - q_{O} (X) \cdot w_{c} (X) = ? 0$

当然这个验证还不够，还要验证 $(σ_{a} (X), σ_{b} (X), σ_{c} (X))$ 与 $(w_{a} (X), w_{b} (X), w_{c} (X))$ 之间的关系。还有，Verifier 如何通过多项式来验证电路的运算，请看后续章节。

References

I understand. Here is the translation:

[BG12] Bayer, Stephanie, and Jens Groth. “Efficient zero-knowledge argument for correctness of a shuffle.” Annual International Conference on the Theory and Applications of Cryptographic Techniques. Springer, Berlin, Heidelberg, 2012.
I understand. Here is the text:

[GWC19] Ariel Gabizon, Zachary J. Williamson, and Oana Ciobotaru. “Plonk: Permutations over lagrange-bases for oecumenical noninteractive arguments of knowledge.” Cryptology ePrint Archive (2019).

World of Z2O-K7E