About this Book

Zero-knowledge cryptography is a fascinating area where a party can prove a statement without revealing any extra information about it: proving that something is true without revealing what makes that "something" true.

Circom is a hardware description language (HDL) specifically designed for creating zero-knowledge proofs. It enables developers to create arithmetic circuits, which are then used to generate proofs that demonstrate a certain statement is true, without revealing any additional information about the inputs used to make that statement.

Circom has opened up new possibilities for creating privacy-preserving applications, such as digital identities, voting systems, and financial transactions, where proving the validity of a statement is necessary, but keeping the underlying data private is critical.

On the other hand, zero-knowledge proofs can serve as proofs of honest computation, without necessarily hiding the data underneath. This paves the way for new methods of off-loaded computation, the most notable example being the zk-rollup ecosystem.

Organization

The book is organized as follows:

  • Preliminaries has some preliminary theory for those who are interested in it. It is not 100% required to know the theory & math behind the relevant cryptography areas to write good Circom code; however, it could allow the reader to:
    • Write more efficient code by utilizing mathematical tricks.
    • Understand certain low-level code.
    • Consider more edge cases & write more secure code.
  • Basics has several hello-world level Circom programs to get the reader started. The reader is free to skip this part & come back later; however, it is strongly recommended to understand how the programs there work.
  • Bits describes methods related to bits, which are signals with value 0 or 1; you can think of them as booleans as well.
  • Comparators describes comparators such as equality, less-than, greater-than-or-equal and such. It also describes some more specialized methods like alias-check and sign-check.
  • Control-Flow describes how if-else works in the circuit world.
  • Arrays describes array methods.
  • Hashing describes circuit-friendly hashes, such as Poseidon and MiMC.
  • Merkle Trees describes Merkle Trees and their usage in Circom circuits.
  • Advanced has some advanced concepts, such as alternative provers or large circuits.

Follow Along

You can clone the repository and examine the circuits yourself; we provide documentation within the code as well. We also provide tests via Circomkit that demonstrate each circuit described here. In the repository, you can run the command below to run all tests:

yarn test

Alternatively, you can play around with the circuits without ever leaving your browser; just visit zkrepl.dev!

Resources

There are many resources about Circom & ZK out there.

Preliminaries

Although not mandatory, you may want to check out some preliminary theory to better understand some concepts within Circom.

Group Theory

The values within our circuits are elements of what is known as a finite field. To understand this, we need a bit of abstract algebra background, especially group theory.

Groups

A group is a set equipped with a binary operation that combines any two elements to form a third element, satisfying certain properties. A group is shown as (G, ∘) and consists of the following components:

  • A set of elements G.
  • A binary operation (denoted as ∘) that takes two elements of G and produces a third element.

The operation must satisfy the following properties:

  • Closure: For any two elements a, b ∈ G, the result of the operation is also in the group: a ∘ b ∈ G. It is said that the group is closed under its binary operation.

  • Identity: There exists an element e ∈ G, called the identity element, such that for any element a ∈ G, the operation a ∘ e = e ∘ a = a.

  • Inverse: For every element a ∈ G, there exists an element b ∈ G, called the inverse of a, such that a ∘ b = b ∘ a = e. The inverse of a is denoted as a⁻¹.

  • Associativity: For any three elements a, b, c ∈ G, the operation is associative, meaning (a ∘ b) ∘ c = a ∘ (b ∘ c). This property ensures that the order of operations does not matter.

There is an additional property as well, called the commutative property or abelian property. A group is said to be Abelian if the binary operation is commutative, meaning a ∘ b = b ∘ a for all elements a, b ∈ G.

If the group has a finite number of elements, it is called a finite group.

Operation Notation

For the binary operation, we can use the additive notation or multiplicative notation.

  • Additive: the operation is written a + b, the identity element is 0, and the inverse of a is −a.
  • Multiplicative: the operation is written a · b (or simply ab), the identity element is 1, and the inverse of a is a⁻¹.

Examples

  • The integers under addition: (ℤ, +).
  • The integers modulo n under addition: (ℤ_n, +).

Rings

A ring is a set equipped with two binary operations, addition and multiplication, that satisfy certain axioms. A ring consists of the following components:

  • A set of elements R.
  • An addition operation (denoted as +) that takes two elements and produces a third element.
  • A multiplication operation (denoted as ·) that takes two elements and produces a third element.

The operations must satisfy the following properties:

  • Additive + Multiplicative Closure: For any two elements a, b ∈ R, the result of the addition is also in the ring: a + b ∈ R, and the result of the multiplication is also in the ring: a · b ∈ R. The ring is closed under both addition and multiplication.

  • Additive + Multiplicative Associativity: For any three elements a, b, c ∈ R, the addition and multiplication operations are associative, meaning (a + b) + c = a + (b + c) and (a · b) · c = a · (b · c). This property ensures that the order of operations does not matter.

  • Additive Identity: There exists an element 0 ∈ R, called the additive identity, such that for any element a ∈ R, the addition a + 0 = 0 + a = a. Nothing is said about multiplication yet.

  • Additive Inverse: For every element a ∈ R, there exists an element b ∈ R, called the additive inverse of a, such that a + b = b + a = 0. The inverse of a is denoted as −a.

  • Addition Commutativity: The addition operation is commutative, meaning a + b = b + a for all elements a, b ∈ R.

  • Distributivity: For any three elements a, b, c ∈ R, the ring satisfies the distributive property, meaning a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c.

If the ring has a multiplicative identity, i.e., an element 1 ∈ R such that 1 · a = a · 1 = a for all a ∈ R, then the ring is called a ring with unity and that element is called a unity.

If the multiplication is commutative, then the ring is called a commutative ring.

If the ring has a finite number of elements, it is called a finite ring.

Examples

  • The set of all integers ℤ, which is a commutative ring with unity.
  • The set of all rational numbers ℚ.
  • The set of all real numbers ℝ.
  • The set of all complex numbers ℂ.

Fields

A field F is a ring with the following properties:

  • F is a commutative ring.
  • There is a non-zero unity 1 ≠ 0.
  • Every non-zero element a ∈ F has a multiplicative inverse a⁻¹ such that a · a⁻¹ = 1.

If the field has a finite number of elements, it is called a finite field. The ring of integers modulo p, denoted as ℤ_p, where p is a prime number, is a finite field. This one is particularly important in cryptography!
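For a tiny worked example (numbers of my own choosing, not from the text above), take p = 7 and do everything modulo 7:

\begin{aligned}
5 + 4 &\equiv 2 \pmod{7}, \\
5 \cdot 4 &\equiv 6 \pmod{7}, \\
-5 &\equiv 2 \pmod{7} \quad (\text{since } 5 + 2 \equiv 0 \pmod{7}), \\
5^{-1} &\equiv 3 \pmod{7} \quad (\text{since } 5 \cdot 3 = 15 \equiv 1 \pmod{7}).
\end{aligned}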

In Circom, when you choose a prime with the -p or --prime option, you actually choose the order of the finite field that the circuit will be built upon. Here is a quick snippet to see the order yourself within the circuit:

template OrderMinus1() {
    log("P - 1 = ", -1);
}

This will print a huge number on your screen, and that number equals −1 in the field; adding 1 to it gives you the order.
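To actually try this out, one minimal way (the file name here is just an example) is to make the template your main component and compile it, as described later in the "Just Logging" section of the Advanced chapter:

// order-minus-1.circom (example file name)
pragma circom 2.1.0;

template OrderMinus1() {
  log("P - 1 = ", -1);
}

// instantiate as the main component, then run:
//   circom order-minus-1.circom --verbose
component main = OrderMinus1();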

Commitments

In a commitment scheme, Alice can commit to a value x to obtain a commitment [x]. At some point later, Bob would like Alice to reveal the committed value x behind [x], and show that indeed Alice was the one that had committed to it in the first place.

A rough informal sketch of this interaction can be shown as:

sequenceDiagram
	actor A as Alice
	actor B as Bob

  note over A: [x] := Commit(x)
  A ->> B: [x]

  note over A, B: sometime later
  B ->> A: show me!
  A ->> B: x

  note over B: im convinced.

A cryptographic commitment scheme has two notable properties:

  • Hiding: A commitment [x] should reveal nothing about the underlying value x.
  • Binding: A commitment [x] should only be revealed to x, i.e. for some x' ≠ x we shouldn't be able to open [x] as a commitment to x'.
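As a concrete illustration (this particular construction is not prescribed above), a hash-based commitment can be sketched in Circom using circomlib's Poseidon hash together with a random blinder for hiding:

// a minimal sketch, assuming circomlib's Poseidon is included:
// [x] = Poseidon(x, blinder), where blinder is a random value known only to Alice
template Commit() {
  signal input value;
  signal input blinder;
  signal output commitment;

  commitment <== Poseidon(2)([value, blinder]);
}

To reveal, Alice later shares value and blinder, and Bob simply recomputes the hash and compares it with [x].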

Arithmetic

Our starting point is arithmetic, for a good reason: in the world of arithmetic circuits, we are only allowed to do two things: addition and multiplication.

First, we will look at adding and multiplying many numbers within a circuit; then we move on to more advanced operations such as the multiplicative inverse, integer division and modular arithmetic.

Multiplier

template Multiplier(n) {
  assert(n > 1);
  signal input in[n];
  signal output out;

  signal inner[n-1];
  inner[0] <== in[0] * in[1];
  for(var i = 0; i < n - 2; i++) {
    inner[i+1] <== inner[i] * in[i+2];
  }

  out <== inner[n-2];
}

There may be several "Hello World!"s of Circom, and I believe the Multiplier circuit is one of them. All that we are doing here is computing the product of all elements in an array.

Since multiplying more than two signals at once is non-quadratic, we have to do this two at a time. Here is what multiplying 4 numbers looks like:

flowchart LR
  in_1(("in[1]")) --> i0
  in_0(("in[0]")) --> i0
  in_2(("in[2]")) --> i1
  in_3(("in[3]")) --> i2
  i0["inner[0]"] --> i1
  i1["inner[1]"] --> i2
  i2["inner[2]"] --> out(("out"))

There is really not much more to talk about in this circuit. It is simply a great way to test whether you understand Circom syntax, loops, template parameters and such.

If you really like having everything as "gates", you could write the circuit with a multiplication gate:

template Mul() {
  signal input in[2];
  signal output out;

  out <== in[0] * in[1];
}

Then, within the Multiplier you can use the Mul gate for each multiplication.
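A sketch of such a gate-based version could look like this (same behavior as Multiplier above, just wired through explicit Mul components):

template MultiplierWithGates(n) {
  assert(n > 1);
  signal input in[n];
  signal output out;

  component muls[n-1];
  for (var i = 0; i < n - 1; i++) {
    muls[i] = Mul();
  }

  // the first gate multiplies the first two inputs
  muls[0].in[0] <== in[0];
  muls[0].in[1] <== in[1];

  // each following gate multiplies the previous result with the next input
  for (var i = 1; i < n - 1; i++) {
    muls[i].in[0] <== muls[i-1].out;
    muls[i].in[1] <== in[i+1];
  }

  out <== muls[n-2].out;
}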

Soundness

Imagine that you would like to use the circuit above to prove that you know the prime factorization of some number k, i.e. that you know factors such that:

k = p_1 * p_2 * ... * p_n

Would it be okay to use the circuit above as given, with out as the only public signal?

If you think about this for a while, you will realize that the circuit does not care whether a factor is 1 or not! Meaning that one can produce the same proof just by providing the array [k, 1, ..., 1], since:

k = k * 1 * ... * 1

The circuit author is responsible for checking these edge cases and writing the necessary constraints to prevent such soundness errors.
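One possible way to rule out this trick, sketched below using the IsZero comparator described later in the Comparators chapter, is to additionally require that no factor equals 1:

template MultiplierNoOnes(n) {
  assert(n > 1);
  signal input in[n];
  signal output out;

  // in[i] - 1 must be non-zero, i.e. no factor is allowed to equal 1
  signal isOne[n];
  for (var i = 0; i < n; i++) {
    isOne[i] <== IsZero()(in[i] - 1);
    isOne[i] === 0;
  }

  out <== Multiplier(n)(in);
}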

Adder

template Adder(n) {
  signal input in[n];
  signal output out;

  var lc = 0;
  for (var i = 0; i < n; i++) {
    lc += in[i];
  }
  out <== lc;
}

Adding an array of values and returning the sum is a common operation within circuits.

No matter how large n is, a sum operation is just a single linear constraint! Formally, for an array in of n values with sum out, the following is a valid rank-1 constraint:

out = in[0] + in[1] + ... + in[n-1]

The variable lc within the code stands for "linear combination", and it is just a variable that accumulates the expression on the right-hand side of the equation above:

lc = in[0] + in[1] + ... + in[n-1]

Inverse

template Inverse() {
  signal input in;
  signal output out;

  assert(in != 0);

  signal inv <-- in != 0 ? 1 / in : 0;
  signal isZero <== 1 - (in * inv);

  // if isZero, since in = 0 this holds
  // otherwise, it holds due to in * 0 == 0
  in * isZero === 0;
  inv * isZero === 0;

  out <== inv;
}

The (multiplicative) inverse of an element a within a prime field is denoted as a⁻¹, such that a · a⁻¹ = 1. The additive inverse of a is shown as −a, and it holds that a + (−a) = 0. Inverting an element within the circuit is inefficient, so the inverse is computed off-circuit and a constraint is added to make sure the product of the two equals 1.

Zero does not have a multiplicative inverse though, and we can have the run-time assertion assert(in != 0) shown above to prevent that. However, a run-time assertion does not prevent a malicious prover from providing an altered witness that has a zero value for in. We must handle that case as well, and output a well-defined value for inv when the input is zero. In the circuit above, we set inv to be zero in that case; the constraint inv * isZero === 0 then forces inv to be zero whenever the input is zero (since isZero equals 1), and it holds trivially when the input is non-zero (since isZero equals 0).

There are two ways to find an inverse outside the circuit:

  1. Use Extended Euclidean Algorithm
  2. Use Fermat's Little Theorem

Interested readers can check them out on the web!
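For instance, Fermat's little theorem gives a direct (though not necessarily the fastest) formula in a prime field:

a^{p-1} \equiv 1 \pmod{p}
\quad\Longrightarrow\quad
a^{-1} \equiv a^{p-2} \pmod{p},
\qquad a \neq 0,\ p \text{ prime}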

IntDiv

template IntDiv(n) {
  signal input in[2];
  signal output out;

  // divisor must be non-zero
  signal is_non_zero <== IsZero()(in[1]);
  0 === is_non_zero;

  // compute the quotient and remainder outside the circuit
  var quot_hint = in[0] \ in[1];
  var rem_hint = in[0] % in[1];
  signal quot <-- quot_hint;
  signal rem <-- rem_hint;

  // constrain the division operation
  // in[0] / in[1] is defined as the unique pair (q, r) s.t.
  // in[0] = in[1] * q + r and 0 <= r < |in[1]|
  in[0] === quot * in[1] + rem;

  // OPTIONAL: quot edge case is when `rem = 0` and `in[1] = 1`
  // signal quot_is_valid <== LessEqThan(n)([quot, in[0]]);
  // 1 === quot_is_valid;

  signal rem_is_valid <== LessThan(n)([rem, in[1]]);
  1 === rem_is_valid;

  out <== quot;
}

Doing integer division within a circuit is more costly than one might expect! Consider two n-bit numbers a and b. An integer division a / b implies the following equation:

a = b * q + r

where q is the quotient and r is the remainder such that 0 <= r < b. As an example, consider the division 7 / 5 (numbers chosen just for illustration). We have:

7 = 5 * 1 + 2

giving us the quotient q = 1 and remainder r = 2. Notice that if we allowed r >= b, we could also have:

7 = 5 * 0 + 7

which would be wrong! On a side note, one may look at what happens if we allowed a larger quotient, such as the following:

7 = 5 * 2 + (-3)

which is again wrong. Thankfully, the constraint on the remainder prevents this as well; here we would get r = -3, which corresponds to a very large number in the prime field, much larger than in[1] in most cases. If the integer division takes place over very large numbers, one may additionally check that quot is not larger than in[0], as in the optional check within the circuit above.

Bits

A bit value can only be 0 or 1. As such, bits can be treated as booleans too, where 0 is false and 1 is true. It is often useful to convert a number to its binary representation, or simply assert that a number fits into n bits for some n that is known at compile time.

AssertBit

template AssertBit() {
  signal input in;

  in * (in - 1) === 0;
}

To assert that a given signal b is a bit (i.e. 0 or 1) you only have to check that b · b = b, as there are only two numbers whose square equals itself: 0 and 1. This expression is equivalent to b · (b − 1) = 0, which is how we have written it in the circuit above.

Normally, people assert that a signal is a bit without the template above; instead, it is preferred to do this inline where it is needed. Still, nothing really stops you from using templates.

Num2Bits

template Num2Bits(n) {
  assert(n < 254);
  signal input in;
  signal output out[n];

  var lc = 0;
  var bit_value = 1;

  for (var i = 0; i < n; i++) {
    out[i] <-- (in >> i) & 1;
    AssertBit()(out[i]);

    lc += out[i] * bit_value;
    bit_value <<= 1;
  }

  lc === in;
}

Num2Bits is an often-used circuit that decomposes a given number into n bits. In doing so, it also asserts that the number is representable with that many bits. The binary decomposition happens the way it is done in most languages: shift and take the last bit, within a loop.

The operation (in >> i) & 1 is non-quadratic, so we have to use <-- there to assign the result to the out[i] signal. At this point, out[i] could be anything a malicious prover provides, so we have to constrain it. At the very least, we can assert that each out[i] is a bit.

The important constraint in this template is that out, when converted back to decimal, should equal our number in. We make that constraint using the lc variable, which acts much like a running sum in ordinary programming. It is important to remember at this point that Circom is an HDL, not a programming language; lc does not actually store a sum at run time, it is a linear combination of the signals:

lc = out[0]*1 + out[1]*2 + out[2]*4 + ... + out[n-1]*2^(n-1)

Thankfully, this entire equality fits in a single constraint, and we can simply check lc === in to ensure the bit representation is correct.

Exercise: How would you modify Num2Bits above to obtain a template like AssertBits(n) that ensures a number can be represented by n bits?
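One possible answer, sketched here (a template with this signature is also used later in the Range Checks section), is to keep the decomposition and the bit constraints but drop the output array:

template AssertBits(n) {
  assert(n < 254);
  signal input in;

  signal bits[n];
  var lc = 0;
  var bit_value = 1;
  for (var i = 0; i < n; i++) {
    bits[i] <-- (in >> i) & 1;
    bits[i] * (bits[i] - 1) === 0; // each bits[i] must be a bit
    lc += bits[i] * bit_value;
    bit_value <<= 1;
  }

  // in must equal its n-bit reconstruction
  lc === in;
}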

Bits2Num

template Bits2Num(n) {
  assert(n < 254);
  signal input in[n];
  signal output out;

  var lc = 0;
  var bit_value = 1;
  for (var i = 0; i < n; i++) {
    AssertBit()(in[i]);

    lc += in[i] * bit_value;
    bit_value <<= 1;
  }

  out <== lc;
}

If we can convert an n-bit number to its binary representation, surely we should be able to convert a binary representation with n bits back to the number itself. We do that with Bits2Num. This operation is rather straightforward; we just need to compute:

out = in[0]*2^0 + in[1]*2^1 + ... + in[n-1]*2^(n-1)

We use bit_value to keep track of 2^i, and this entire sum expression is stored within lc (the linear combination). In the end, we constrain the output signal to be equal to this expression.

Note that for both Num2Bits and Bits2Num, the most significant bit is the last element of the array and the least significant bit is the first element. To demonstrate, consider the 4-bit number 11, normally written as 1011 in binary. However, in these circuits we store the array [1, 1, 0, 1], in the opposite order!

Function: nbits

function nbits(n) {
  var ans = 0;

  var tmp = 1;
  while (tmp - 1 < n) {
      ans++;
      tmp <<= 1;
  }

  return ans;
}

We might want to find the minimum number of bits required to represent a known value; we use a function called nbits (as in Circomlib) for that. Here, our parameter n is known at compile time, perhaps a template parameter or some constant somewhere.

The idea of this function is to construct the largest ans-bit value, 2^ans - 1, increasing ans until that value is no longer smaller than our input.
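As a quick trace with an example input of my own (say n = 5):

\begin{aligned}
\texttt{ans}=0,\ \texttt{tmp}=1 &: \quad \texttt{tmp}-1 = 0 < 5 \ \text{(continue)} \\
\texttt{ans}=1,\ \texttt{tmp}=2 &: \quad \texttt{tmp}-1 = 1 < 5 \ \text{(continue)} \\
\texttt{ans}=2,\ \texttt{tmp}=4 &: \quad \texttt{tmp}-1 = 3 < 5 \ \text{(continue)} \\
\texttt{ans}=3,\ \texttt{tmp}=8 &: \quad \texttt{tmp}-1 = 7 \not< 5 \ \Rightarrow \ \texttt{nbits}(5) = 3
\end{aligned}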

Logic Gates

It is often required to do some logical operations like AND, OR, NOT and XOR within a circuit. For the circuits within this chapter, we can assume that the inputs are asserted to be bits, i.e. they are either 0 or 1.

AND

template AND() {
  signal input in[2];
  signal output out;

  out <== in[0]*in[1];
}

The logic table of AND is as follows:

AND   0   1
  0   0   0
  1   0   1

This can be achieved by simply multiplying the two inputs!

OR

template OR() {
  signal input in[2];
  signal output out;

  out <== in[0] + in[1] - in[0]*in[1];
}

The logic table of OR is as follows:

OR    0   1
  0   0   1
  1   1   1

We almost achieve the result by adding the two inputs, except that we get 2 when both are 1. Since AND is 1 exactly when both inputs are 1, we can subtract it from the sum to fix that case.

What we have is equivalent to in[0] + in[1] - AND(in).

XOR

template XOR() {
  signal input in[2];
  signal output out;

  out <== in[0] + in[1] - 2*in[0]*in[1];
}

The logic table of XOR is as follows:

XOR   0   1
  0   0   1
  1   1   0

We can use the same trick as in OR, subtracting AND(in) once more, so that the 1 + 1 case turns into zero. What we have is equivalent to in[0] + in[1] - 2 * AND(in).

NOT

template NOT() {
  signal input in;
  signal output out;

  out <== 1 - in;
}

A NOT gate maps 0 to 1, and 1 to 0. We can achieve this by simply subtracting the signal from 1, since 1 − 0 = 1 and 1 − 1 = 0.

NAND

template NAND() {
  signal input in[2];
  signal output out;

  out <== 1 - in[0]*in[1];
}

NAND is equivalent to NOT(AND(in)), giving us the circuit above.

NOR

template NOR() {
  signal input in[2];
  signal output out;

  out <== 1 - (in[0] + in[1] - in[0]*in[1]);
}

NOR is equivalent to NOT(OR(in)), giving us the circuit above.

Comparators

Comparators compare two numbers in an arithmetic circuit. At the most basic level, we are interested in comparing a number to zero. Then, we can derive an equality check by simply checking whether the difference of the two inputs is equal to zero.

  • a == 0 is the IsZero circuit

We usually make use of the LessThan circuit, which does an a < b check, and derive all the remaining comparisons from that. This should make sense, since assuming both a and b are integers we can say:

  • a < b is the LessThan circuit
  • a ≤ b is the same as a < b + 1
  • a > b is the same as b < a
  • a ≥ b is the same as b < a + 1

IsZero

template IsZero() {
  signal input in;
  signal output out;

  signal inv <-- in != 0 ? 1 / in : 0;
  out <== 1 - (in * inv);

  in * out === 0;
}

Here, we check whether the given number has a multiplicative inverse; the only number that does not have an inverse is 0. However, inverting a number is a non-quadratic operation (hence the <-- instead of <==), so we must assume that the signal inv can be any arbitrary value from the perspective of an adversarial prover, and constrain it accordingly.

We can't really add a constraint like 1 === in * inv, because it would simply fail when in is 0. That is why we make use of a ternary operator to set inv to zero when in is zero. So, we may expect in * inv to equal 1 exactly when the number has an inverse, meaning that the number is non-zero. By subtracting this from 1, we essentially do a logical NOT on the result, so that if in has an inverse, out becomes 0.

Let's examine what happens when an adversary provides some arbitrary inv = x, in both cases where in is zero and non-zero.

When in = 0 and inv = x, all constraints look fine:

  • out <== 1 - 0 * x, so out = 1
  • in * out === 0 becomes 0 * 1 === 0, which holds

When in != 0 and inv = x, the constraint is indirectly implied:

  • out <== 1 - in * x
  • in * (1 - in * x) === 0

Here, out could seemingly be anything, but for in * (1 - in * x) === 0 to pass with a non-zero in, the only option is that 1 === in * x, so x must indeed be the inverse of in.

IsEqual

template IsEqual() {
  signal input in[2];
  signal output out;

  out <== IsZero()(in[1] - in[0]);
}

As we mentioned before, if a = b then a - b = 0, allowing us to re-use the IsZero circuit.

LessThan

template LessThan(n) {
  assert(n <= 252);
  signal input in[2];
  signal output out;

  component toBits = Num2Bits(n+1);
  toBits.in <== ((1 << n) + in[0]) - in[1];

  out <== 1 - toBits.out[n];
}

There is something going on here, as you might notice: why are we using Num2Bits, and what is that 1 << n doing there? It will all make sense shortly.

The trick of LessThan is to think in terms of bitwise subtraction. Think of subtracting two n-bit numbers, a - b. Now suppose that before this subtraction we set an extra n'th bit to 1, which is what adding 1 << n does. Note that the most significant bit of an n-bit number is the (n-1)'th bit, as we are speaking in zero-indexed terms.

If a is less than b, then during this subtraction a will have to borrow that n'th bit. Otherwise, a is large enough that b can be subtracted from it without needing to borrow. So, to see if a < b we can simply check whether the n'th bit is borrowed during the subtraction. Since the n'th bit will be 0 if it is borrowed, we do a logical NOT operation (simply subtract the bit from 1) as we assign it to the output.

The role of assert(n <= 252) is to ensure we can safely set the n'th bit to 1. That would be a problem if n were greater than 252, as that is where the order of the scalar field of our curve (BN254 in this case) is cut off.
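A small worked example of my own, with n = 3 so that 1 << n = 8:

\begin{aligned}
\texttt{in} = [2, 5] &: \quad 8 + 2 - 5 = 5 = 0101_2,\ \text{bit 3 is } 0 \implies \texttt{out} = 1 - 0 = 1 \quad (2 < 5) \\
\texttt{in} = [5, 2] &: \quad 8 + 5 - 2 = 11 = 1011_2,\ \text{bit 3 is } 1 \implies \texttt{out} = 1 - 1 = 0 \quad (5 \not< 2)
\end{aligned}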

LessEqThan

template LessEqThan(n) {
  signal input in[2];
  signal output out;

  out <== LessThan(n)([in[0], in[1]+1]);
}

This is because a ≤ b is equivalent to a < b + 1.

GreaterThan

template GreaterThan(n) {
  signal input in[2];
  signal output out;

  out <== LessThan(n)([in[1], in[0]]);
}

This is because a > b is equivalent to b < a.

GreaterEqThan

template GreaterEqThan(n) {
  signal input in[2];
  signal output out;

  out <== LessThan(n)([in[1], in[0]+1]);
}

This is because a ≥ b is equivalent to b < a + 1.

Constant Comparisons

Circomlib comes with a more efficient method for comparing a signal against a constant. In this section, we explain the constant-comparison method along with several of its usages.

These methods depend on the prime field used within the circuit, so re-using these same circuits over other prime fields may cause bugs due to hard-coded values!

CompConstant

TODO: the circuit code is written but I can't yet explain how it works...

AliasCheck

template AliasCheck() {
  signal input in[254];

  component comparator = CompConstant(-1);
  for (var i=0; i<254; i++) {
    comparator.in[i] <== in[i];
  }
  comparator.out === 0;
}

The alias check simply asserts that a given 254-bit number (in binary representation) is within the prime field of BN254. The role of -1 there is just a shortcut to obtain the largest number within the field, i.e. p − 1.

Sign

template Sign() {
  signal input in[254];
  signal output sign;

  // 1 / 2 here is the field inverse of 2, i.e. (p + 1) / 2, so half - 1 = (p - 1) / 2
  var half = 1 / 2;
  component comparator = CompConstant(half - 1);
  for (var i = 0; i < 254; i++) {
      comparator.in[i] <== in[i];
  }

  sign <== comparator.out;
}

In a prime field, a field element is defined to be positive if it is closer to 0 than it is to the order of the field. Sign checks for that property, and returns 0 if the number is positive, or 1 if the number is negative. Specifically, for order p and a number x (given as 254 bits), the sign of x is:

  • positive when x ≤ (p − 1) / 2
  • negative when x > (p − 1) / 2

Range Checks

A range check is a common utility, asserting that a number is in some range [MIN, MAX]. It is important to keep in mind that you can make use of bit decompositions to check for the range [0, 2^n), simply by checking whether a number is representable with n bits.

Lookup tables are often used for range checks in other zkDSLs, but I don't yet know how to use them in Circom.

AssertInRange

template AssertInRange(MIN, MAX) {
  assert(MIN < MAX);
  signal input in;

  var b = nbits(MAX);

  component lowerBound = AssertBits(b);
  lowerBound.in <== in - MIN;

  component upperBound = AssertBits(b);
  upperBound.in <== in + (2 ** b) - MAX - 1;
}

Above is one way of doing a generic range check. Our approach here is to check:

  • in - MIN is a b-bit value
  • in + 2^b - 1 - MAX is a b-bit value

where b is the minimum number of bits required to represent MAX. The nbits function is described within the bits section.

If in < MIN, the operation in - MIN will underflow and we will have a huge number, definitely not b-bit representable (assuming a not-so-large MAX). As an edge case, if in == MIN we get 0, which certainly fits in b bits.

If in > MAX, the operation in + 2^b - 1 - MAX becomes larger than 2^b - 1, which is not b-bit representable. As an edge case, if in == MAX we get 2^b - 1, which is the largest b-bit number.
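A small worked example of my own, with MIN = 10 and MAX = 20, so b = nbits(20) = 5:

\begin{aligned}
\texttt{in} = 15 &: \quad 15 - 10 = 5 \ \text{(fits in 5 bits)}, \quad 15 + 31 - 20 = 26 \ \text{(fits in 5 bits)} \\
\texttt{in} = 9 &: \quad 9 - 10 = -1 \equiv p - 1 \ \text{(does not fit in 5 bits)} \\
\texttt{in} = 21 &: \quad 21 + 31 - 20 = 32 = 2^5 \ \text{(does not fit in 5 bits)}
\end{aligned}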

Control Flow

In the world of arithmetic circuits, control flow is a bit different from what most developers are used to. In normal programming, only one of the branches is executed:

if (cond) {
  foo(); // only if cond == true
} else {
  bar(); // only if cond == false
}

That is not how things work in our circuits! Instead, both branches are executed but only one of them is "returned".

To understand why, first remind yourself that we are ALWAYS working with integers in our circuits. Both foo() and bar() will return an integer, and cond is either 0 or 1. The trick is to combine these results with cond and NOT(cond) in a simple sum expression:

cond*foo() + NOT(cond)*bar();

Opening this up, we have:

cond*foo() + (1-cond)*bar();

In fact, we can use one multiplication instead of two, using the following form:

cond*(foo() - bar()) + bar();

IfElse

template IfElse() {
  signal input cond;
  signal input ifTrue;
  signal input ifFalse;
  signal output out;

  out <== cond * (ifTrue - ifFalse) + ifFalse;
}

Following our description above, an if-else is defined by the given circuit. Remember that both ifTrue and ifFalse signals are computed, regardless of the condition, but only one of them is returned.

Conditional Constraints

In your application, there may be cases where ifTrue and ifFalse can't be valid at the same time, due to some condition; ifTrue might constrain a signal to be 0 but ifFalse might constrain that same signal to something else.

In these cases, we may add an auxiliary signal which can enable/disable a constraint check. As an example, imagine the following template that constrains a signal to be 5:

template IsFive() {
  signal input in;

  in - 5 === 0;
}

We can add an auxiliary isEnabled signal that is either 0 or 1, and disable this circuit when isEnabled = 0 as follows:

template IsFive() {
  signal input in;
  signal input isEnabled;

  isEnabled * (in - 5) === 0;
}

If isEnabled = 1 the two circuits are equivalent, but if isEnabled = 0 this circuit reduces to 0 === 0 regardless of the in signal.

Circomlib has a circuit that makes use of this trick, see ForceEqualIfEnabled.
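In the same spirit, a simplified sketch of such a template (not circomlib's exact code) could look like:

template ForceEqualIfEnabled() {
  signal input enabled;
  signal input in[2];

  // when enabled is 0 this reduces to 0 === 0,
  // otherwise it forces in[0] === in[1]
  enabled * (in[1] - in[0]) === 0;
}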

Switch

template Switch() {
  signal input cond;
  signal input in[2];
  signal output out[2];

  out[0] <== cond * (in[1] - in[0]) + in[0];
  out[1] <== cond * (in[0] - in[1]) + in[1];
}

It is often useful to swap the places of two signals based on a condition, which can be achieved by putting two IfElse lines together.

You can do this with a single multiplication too!

template Switch() {
  signal input cond;
  signal input in[2];
  signal output out[2];

  signal aux <== (in[1] - in[0]) * cond;

  out[0] <==  aux + in[0];
  out[1] <== -aux + in[1];
}

Multiplexing

TODO

Arrays

Circom arrays are a different kind of beast. The main reason is that array operations in Circom must be known at compile time. What do we mean by this:

  • Array sizes are fixed, e.g. you can't define an array based on user input after compiling the circuit.
  • Array indexing must be known at compile time, e.g. you can't ask the user for an index i and return arr[i] like you normally would.

Before we get to the problematic unknown-at-compile-time stuff, let's quickly recap the known-at-compile-time array operations:

// an array with N elements
signal arr[N];

// a multi-dimensional array of size N * M
signal arr[N][M];

// read element at index i for known i
foo <== arr[i];

// write to element at index j for known j
arr[j] <== bar;

So now, what if we want to read or write to an index unknown at compile time?

ArrayRead

template ArrayRead(n) {
  signal input in[n];
  signal input index;
  signal output out;

  signal intermediate[n];
  for (var i = 0; i < n; i++) {
    var isIndex = IsEqual()([index, i]);
    intermediate[i] <== isIndex * in[i];
  }

  // adder from "Arithmetics" section
  out <== Adder(n)(intermediate);
}

To read from an index unknown at compile time, we can instead read ALL signals (which is an operation known at compile time) and then return a linear combination of them, with each value multiplied by an equality check between its position and our index.

arr_i =
  A[0]   * (i == 0)
+ A[1]   * (i == 1)
+ ...
+ A[n-1] * (i == n-1)

This way, our array accesses are known at compile-time but we are still able to get the value at index i, albeit at the cost of some constraints.

arr_i =
  A[0]   * 0
+ A[1]   * 0
+ ...
+ A[i]   * 1
+ ...
+ A[n-1] * 0
= A[i]

In the paper xJsnark, this method is referred to as "Linear Scan".

ArrayWrite

template ArrayWrite(n) {
  signal input in[n];
  signal input index;
  signal input value;
  signal output out[n];

  for (var i = 0; i < n; i++) {
    var isIndex = IsEqual()([index, i]);
    out[i] <== IfElse()(isIndex, value, in[i]);
  }
}

Writing to an unknown index works in a similar way to reading one. The idea is to simply copy the input signal array to an output array, except at the requested index, where we use our new value instead of the existing one at that position.

We have defined the IfElse template under the Control Flow section.

Distinct

We may often need to check that an array has non-repeating values, also known as unique or distinct values. A common example would be a Sudoku circuit, e.g. you would like to assert that all values in a row are in the range [1, 9] and that the row has distinct values.

AssertDistinct

template AssertDistinct(n) {
  signal input in[n];

  var acc = 0;
  for (var i = 0; i < n-1; i++){
    for (var j = i+1; j < n; j++){
      var eq = IsEqual()([in[i], in[j]]);
      acc += 1 - eq;
    }
  }

  acc === n * (n - 1) / 2;
}

To assert that an array has distinct values, we can loop over the values and check each unique pair to be non-equal using the IsEqual template. For example, in an array of 4 elements this corresponds to the following checks:

IsEqual()([in[0], in[1]]) === 0
IsEqual()([in[0], in[2]]) === 0
IsEqual()([in[0], in[3]]) === 0
IsEqual()([in[1], in[2]]) === 0
IsEqual()([in[1], in[3]]) === 0
IsEqual()([in[2], in[3]]) === 0

We could add a constraint for each pair, but instead we can accumulate all the results and check them with a single constraint at the end, as done with the acc variable above; since there are n * (n - 1) / 2 pairs, all checks pass exactly when acc equals that count.
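For instance, the Sudoku row example mentioned above could be sketched by combining this template with AssertInRange from the Comparators chapter (the template name here is made up):

template AssertSudokuRow(n) {
  signal input row[n];

  for (var i = 0; i < n; i++) {
    // every value must be in the range [1, 9]
    AssertInRange(1, 9)(row[i]);
  }

  // and all values must be distinct
  AssertDistinct(n)(row);
}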

IsDistinct

template IsDistinct(n) {
  signal input in[n];
  signal output out;

  var acc = 0;
  for (var i = 0; i < n-1; i++){
    for (var j = i+1; j < n; j++){
      var eq = IsEqual()([in[i], in[j]]);
      acc += eq;
    }
  }

  signal outs <== acc;
  out <== IsZero()(outs);
}

If you would like to return a bit signal based on whether an array has all-distinct values or not, you can slightly modify the AssertDistinct template to obtain that functionality. Instead of asserting that each IsEqual check fails, we accumulate the results and then return whether that final sum is equal to zero or not.

Note that technically it is possible for acc to overflow and wrap back to 0; however, that is unlikely to happen given how large the prime field is, and we would need that many components to be able to overflow using 1 + 1 + ... + 1 only.

IsSorted

template IsSorted(n, b) {
  signal input in[n];
  signal output out;

  // accumulate the failed in[i-1] <= in[i] checks
  var acc = 0;
  for (var i = 1; i < n; i++) {
    var isLessEqThan = LessEqThan(b)([in[i-1], in[i]]);
    acc += 1 - isLessEqThan;
  }

  // sorted if and only if no check has failed
  signal outs <== acc;
  out <== IsZero()(outs);
}

If we need an array to be sorted, we could instead sort the array out-of-circuit and pass in the sorted array, finally asserting that it is indeed sorted. To do this, we can simply check that consecutive elements are ordered, that is, in[i-1] ≤ in[i] for all 0 < i < n; here we accumulate the failed checks and output 1 only if there are none.

AssertSorted

template AssertSorted(n, b) {
  signal input in[n];

  // accumulator for in[i-1] <= in[i] checks
  var acc = 0;
  for (var i = 1; i < n; i++) {
    var isLessThan = LessEqThan(b)([in[i-1], in[i]]);
    acc += isLessThan;
  }

  acc === n - 1;
}

If you would like to assert that the array is sorted instead of returning 0 or 1, you can simply check that acc === n - 1 at the end. This is because we make n - 1 comparisons and accumulate all of them within the acc variable; if all checks pass, they sum up to n - 1.

Hashing

Poseidon

MiMC

Merkle Trees

If you have been in the world of crypto for a while, it is highly likely that you have heard the term Merkle Tree, also known as a Merkle Hash Tree. A Merkle Tree is a hash-based data structure that can serve as a cryptographic commitment scheme.

You can commit to a set of values using a merkle tree, such as:

  • Evaluations of a function
  • Coefficients of a polynomial
  • Files in your database

Here is an example, where we commit to a vector of characters (spelling "myvector") using a Merkle Tree:

graph BT
  subgraph Merkle Tree
  h1["h1 (Merkle Root)"]
	h2 --> h1;  h3 --> h1
	h4 --> h2;  h5 --> h2
  h6 --> h3;  h7 --> h3
  h8 --> h4;  h9 --> h4
  h10 --> h5; h11 --> h5
  h12 --> h6; h13 --> h6
  h14 --> h7; h15 --> h7
  end

  m -- hash --> h8
  y -- hash --> h9
	v -- hash --> h10
  e -- hash --> h11
  c -- hash --> h12
  t -- hash --> h13
  o -- hash --> h14
  r -- hash --> h15

In a Merkle Tree, every node is made up of the hash of its children. In this example binary tree, that would be:

  • h1 = hash(h2, h3)
  • h2 = hash(h4, h5)
  • and so on.

The leaf nodes are the hashes of elements of the committed set of data. The final hash at the root of the tree is called the Merkle Root.

Merkle Trees are often implemented as binary trees, but the concept works for k-ary trees as well, where each node has k children.

Merkle Proof

At some point, we may be asked to show that some element of the committed data indeed exists at some position. For our example above, a verifier could ask "is there really a t at position 6?".

A naive method would be to hand over the entire committed set of data and let the verifier recompute the Merkle Root; however, we can do much better than that! With Merkle Trees, we can answer such queries much more efficiently, and without revealing any data other than the element we are asked to reveal.

The trick is to provide the hashes needed to compute the path from the requested element all the way up to the Merkle Root. In total, we only need to provide one hash per level (in the case of a binary Merkle Tree), and the verifier can compute the root!

For instance, to show that we have t at position 6, we need to provide the hashes that are used to compute the parent nodes on the way up:

  • h13 = hash(t) is computed by the verifier.
  • h6 = hash(h12, h13) requires h12 to be provided.
  • h3 = hash(h6, h7) requires h7 to be provided.
  • h1 = hash(h2, h3) requires h2 to be provided.
  • h1 is our commitment (the Merkle Root), which the verifier knows already.

The proof is visualized below for the same tree, where the values known & computed by the verifier are colored green and the values provided by the prover are colored blue:

graph BT
  classDef g fill:#afa
  classDef b fill:#aaf

  subgraph Merkle Tree
  h1["h1 (Merkle Root)"]:::g
	h2:::b --> h1
  h3:::g --> h1
	h4[fa:fa-eye-slash] --> h2
  h5[fa:fa-eye-slash] --> h2
  h6:::g --> h3
  h7:::b --> h3
  h8[fa:fa-eye-slash] --> h4
  h9[fa:fa-eye-slash] --> h4
  h10[fa:fa-eye-slash] --> h5
  h11[fa:fa-eye-slash] --> h5
  h12:::b --> h6
  h13:::g --> h6
  h14[fa:fa-eye-slash] --> h7
  h15[fa:fa-eye-slash] --> h7
  end

  t:::g -- hash --> h13

You see, we only needed to provide 3 hashes here, although our data had 8 elements! In fact, if you have n elements you only need to provide log2(n) hashes to the verifier, which is much more efficient than the naive method of sending all the data to the verifier.

As a Commitment Scheme

A Merkle Root can serve as a cryptographic commitment to a set of data.

  • It is hiding because you can't efficiently find the preimage of a hash.
  • It is binding because breaking it would require finding a hash collision efficiently, which is assumed to be intractable.

To reveal that some value is part of the committed set of data at a specific position, you only need to reveal the path from that leaf to the root, along with the value itself, as described above.
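As a sketch of how such a reveal could be verified inside a Circom circuit, assuming circomlib's Poseidon as the hash and the Switch template from the Control Flow chapter (the template and signal names here are illustrative, not from the text):

template MerkleInclusionProof(depth) {
  signal input leaf;             // hash of the revealed value
  signal input root;             // the Merkle Root, i.e. the commitment
  signal input siblings[depth];  // one provided hash per level
  signal input pathBits[depth];  // 0: current node is a left child, 1: right child

  signal hashes[depth + 1];
  hashes[0] <== leaf;

  component switches[depth];
  component hashers[depth];
  for (var i = 0; i < depth; i++) {
    // path bits must be bits
    pathBits[i] * (pathBits[i] - 1) === 0;

    // order the (current, sibling) pair according to the path bit
    switches[i] = Switch();
    switches[i].cond <== pathBits[i];
    switches[i].in[0] <== hashes[i];
    switches[i].in[1] <== siblings[i];

    hashers[i] = Poseidon(2);
    hashers[i].inputs[0] <== switches[i].out[0];
    hashers[i].inputs[1] <== switches[i].out[1];
    hashes[i + 1] <== hashers[i].out;
  }

  // the recomputed root must match the commitment
  root === hashes[depth];
}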

Further Reading

Sparse Merkle Tree

In our description of the Merkle Tree, we had a binary tree with 8 leaf nodes, and we filled all of them with the elements of our vector. What if we did not have enough data to fill all the leaves?

Advanced

The world of Circom is vast, and here we take a look at some more advanced concepts.

Alternative Provers

SnarkJS is not the only option for proving Circom circuits; you can check out some alternative provers in other languages as well:

  • Go-Rapidsnark is a collection of Go packages by Iden3, that allow you to compute witnesses, generate, or verify Circom proofs.
  • Ark-Circom has Arkworks bindings to Circom's R1CS for proof & witness generation.
  • Lambdaworks has Circom adapters for Groth16.

However, these are not as capable as SnarkJS; for example, they only support the Groth16 protocol.

Large Circuits

If you have very large circuits (e.g. >20M constraints) you will have some practical problems, most notably the limited memory of your machine. There is a great HackMD post that describes what the best practices are for large circuits, see here.

Folding

Folding is when a "well-structured" circuit with many iterations is folded into a single "iteration" from which a proof is generated that proves all iterations are correct. See more at awesome-folding.

  • A recent work from PSE called Sonobe paves the way to fold Circom circuits.

  • Nova-Scotia is a middleware to compile Circom circuits for the Nova prover.

Just Logging

If you would like to experiment with some circuit code quickly, zkRepl is usually the best option. If you still want to play locally, you can simply write logs in your code and compile your circuit with:

circom ./your-circuit.circom --verbose

This will not emit any build artifacts, it will only print logs.

Simple

Fibonacci

template Fibonacci(n) {
  assert(n >= 2);
  signal input in[2];
  signal output out;

  signal fib[n+1];
  fib[0] <== in[0];
  fib[1] <== in[1];
  for (var i = 2; i <= n; i++) {
    fib[i] <== fib[i-2] + fib[i-1];
  }

  out <== fib[n];
}

Calculating the n'th Fibonacci number is yet another "Hello World!" of Circom. We simply constrain the starting numbers (given as the input in) and apply the recurrence relation fib[i] = fib[i-1] + fib[i-2].

Here is what computing the 4'th Fibonacci number looks like:

flowchart LR
  in_1(("in[1]")) --> f1["fib[1]"]
  in_0(("in[0]")) --> f0["fib[0]"]
  f1 --> f2
  f0 --> f2["fib[2]"]
  f2 --> f3
  f1 --> f3["fib[3]"]
  f3 --> f4
  f2 --> f4["fib[4]"]
  f4 --> out(("out"))

You can also implement Fibonacci recursively in Circom!

template FibonacciRecursive(n) {
  signal input in[2];
  signal output out;

  if (n <= 1) {
    out <== in[n];
  } else {
    var prev = FibonacciRecursive(n-1)(in);
    var prevprev = FibonacciRecursive(n-2)(in);
    out <== prev + prevprev;
  }
}

However, the constraint count will be much higher, because this is the naive recursive way where we are not memoizing. In fact, the constraint counts themselves grow like the Fibonacci numbers in this circuit.

MagicSquare

TODO

Intermediate

Sudoku

TODO