Privacy Pools: A Compromise Between Perfect Privacy and Perfect Observability

Published in Coinmonks, Oct 10, 2023

On the question of “Privacy vs. Transparency”, I remember a comment from Brian Brooks at a conference last year — “We as a society need to become better at balancing two equally important but contradictory priorities”.

Protecting the privacy of digital transactions is a prime concern but we also cannot turn a blind eye to the fact that the privacy afforded by crypto tools has led to crypto being used for money laundering and financing terrorist activity.

The crypto tool most often leveraged for such activities was Tornado Cash.

* $100mn Harmony Horizon Bridge hack in Jun’22, laundered using Tornado Cash
* $150mn Compound Finance Hack in Jun’20, laundered using Tornado Cash
* $326mn Wormhole hack in Feb’22, laundered using Tornado Cash
* $600mn Axie Infinity hack in Mar’22, executed by Lazarus, North Korea’s government-sponsored hacking group, and laundered using Tornado Cash

So how do we balance these two equally important but contradictory priorities of preserving privacy and preventing money laundering?

“Privacy Pools” may be the answer. A whitepaper coauthored by Vitalik Buterin, Jacob Illum, Matthias Nadler, Fabian Schär and Ameen Soleimani discusses this in detail.

But let us start at the beginning.

Privacy is necessary and public blockchains currently do not support it

Public blockchains are by design fully transparent. If you make an ETH payment to me, I can open an Ethereum block explorer, see your current balance, and track all past and future payments to and from your address.

The information can be used to create a profile of you by those trying to target you with ads, by governments deciding on your social score, or by general busybodies tracking everything from your political affiliations to views on birth control.

The argument that only criminals want privacy is flawed. Regular, law-abiding citizens also deserve privacy. And the open architecture of Bitcoin does not support privacy.

Does this mean that if one wants decentralization then one must give up on privacy and if one wants privacy then one must give up on decentralization?

Pseudonymity is not real privacy

Bitcoin transactions are pseudonymous i.e. while all transactions are visible to everyone, the executors of transactions are 26–35 character wallet addresses. Satoshi Nakamoto seems to have naively assumed that these on-chain addresses would not be linked to real-world identities.

This assumption is clearly incorrect in a world equipped with sophisticated forensic tools.

Privacy Coins

The first approach to handle the problem was privacy coins, which are able to obscure the addresses of payers and recipients as well as the value of transactions.

Monero and ZCash are the two predominant privacy-promoting blockchains with the native tokens XMR and ZEC respectively. Monero achieves privacy through RingCT and Stealth addresses and ZCash achieves the same by shielding addresses with zero-knowledge proofs.

While this is a good start, it provides privacy only for transactions executed in privacy coins i.e. to enjoy privacy, you would be forced to limit your transactions to privacy coins like XMR and ZEC.

Is there a way to achieve privacy irrespective of which cryptocurrency one chooses to transact with? That is where mixers come in.

Mixers

A mixer works like this: if you have 1 ETH and don’t want your neighbors to know how you spend it, you add your 1 ETH to a pool where thousands of other people have added 1 ETH each. Next, you and the other 1000+ depositors withdraw ETH from the pool to completely new wallet addresses. This way an observer cannot know which withdrawal address is yours.

How does one practically achieve this though?

If you could find 1000+ people who you trust to keep your information secret then setting up such a pool would be easy. Only the people in the pool would know each other's identity.

=> But this is not real security. If even one of these 1000+ random people decides to, they can reveal your transaction details to all and sundry.

What if you relied on a trusted intermediary to manage the Mixer? Only the administrator of the Mixer would know the deposit and withdrawal addresses of the participants.

=> This setup is not secure either because it requires you to trust an intermediary to (a) not steal your tokens and (b) destroy all identifying information post-transaction.

Is there a trustless solution where one would not have to share deposit and withdrawal information with anyone?

Tornado Cash

Tornado Cash mixes funds without participants having to reveal to anyone which deposit and withdrawal addresses belong to them.

Technically, this is difficult to achieve. Public blockchains are “public”, which means that the data you enter is visible to everyone. So how can anyone hide the link between their deposit and withdrawal address?

Tornado Cash achieves this through the use of a combination of (a) Hashing (b) Merkle Trees and (c) Zero-Knowledge Proofs.

We’ll discuss these terms soon but let’s first consider why regulators don’t like Tornado Cash.

The problem with Tornado Cash is that while it serves legitimate privacy needs, it can also be exploited by terrorists and drug dealers to launder money and obscure the source of funds.

Is there a way to achieve privacy for legitimate purposes and deny privacy for malicious purposes?

Separating Equilibrium: Voluntary disclosure to trusted parties

One solution, suggested by the Federal Reserve Bank of St. Louis in the paper “Tornado Cash and Blockchain Privacy”, is to have a “separating equilibrium” between honest and dishonest actors. Good actors who use Mixers for legitimate privacy needs have the option to voluntarily disclose the source of funds to certain entities. This way their transactions would be shielded from everyone except the entities they choose to share the information with.

So if I’ve paid you for services rendered and you don’t want me to observe how you use those funds in the future, you might choose to put those funds into a mixer. If you later choose to deposit those funds in a bank, the bank will inquire about the source of funds. You can provide cryptographic proof to your bank to reveal exactly where your funds came from i.e. from me, thus assuring the bank that your funds came from legitimate sources.

If the source of funds was a known sanctioned address or an illegal source like a protocol hack, you wouldn’t provide such cryptographic proof since you’d essentially be admitting that the funds were tainted.

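The game-theoretic argument is that honest users can always separate themselves from dishonest ones by providing proof, while anyone who cannot or will not provide proof is treated with suspicion. That makes mixers far less useful to bad actors.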

This approach, though interesting, is not free of problems:
* It requires you to trust the bank with exact transaction details. Even if you believe that the bank is trustworthy, the whole point of blockchain is that you shouldn’t have to trust anyone.
* You will be sharing my wallet details with your bank and I might not be comfortable with that.
* The bank becomes a single point of failure. An entity sitting on such valuable information is a juicy attack vector for hackers.

What is needed is a way for the bank to verify that the funds came from legitimate sources without revealing the exact source.

Privacy Pools

This finally brings us to Privacy Pools as elaborated by Vitalik Buterin, Jacob Illum, Matthias Nadler, Fabian Schär and Ameen Soleimani in their recent whitepaper. Privacy Pools build on the “separating equilibrium” concept discussed above.

With privacy pools, you don’t need to disclose exactly which deposit address your funds came from. Instead, you prove that your withdrawals either:

* Came from a subset of deposits known to be clean, through the use of “membership proofs”, or
* Did not come from deposits made by known sanctioned addresses, through the use of “exclusion proofs”.

This way you don’t disclose the exact source of funds to anyone, but you do assure them that your funds have come from legitimate and honest sources.

So far we have covered the need for privacy, the characteristics of blockchains that make privacy difficult, and several approaches to privacy, from privacy coins to privacy pools. If you’re curious about the technicals of how privacy is achieved using cryptography, please keep reading.

Technicals

We start with the 3 basic building blocks:

1. Hashing
2. Merkle Trees
3. Zero-Knowledge Proofs (ZKP)

1. Hashing

Hashing works as follows:

* Take any input.
* Run it through a hashing algorithm. (There are several hashing algorithms. Bitcoin uses SHA-256 and Ethereum uses Keccak-256.)
* Generate an output, which is called the “hash” of the input.

The magic of a good hashing algorithm is:

* Even for large inputs, the hash has a fixed, manageable size. In the case of SHA-256, the output is 256 bits, usually written as 64 hexadecimal characters.
* A particular input always produces the same hash.
* Even a minor change to the input leads to a completely different hash.
* If you know only the hash, you cannot feasibly reverse-engineer and re-create the input.
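These properties can be checked with a few lines of Python. This is only a sketch using SHA-256 from the standard library; the helper name sha256_hex and the sample payment strings are made up for illustration.

```python
import hashlib

def sha256_hex(data: str) -> str:
    """Return the SHA-256 digest of a string as hexadecimal text."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()

# Illustrative inputs (not real transactions)
short = sha256_hex("pay Alice 1 ETH")
long_ = sha256_hex("pay Alice 1 ETH" * 10_000)   # a much larger input
changed = sha256_hex("pay Alice 2 ETH")          # one character differs

print(len(short), len(long_))                    # fixed size regardless of input size
print(short == sha256_hex("pay Alice 1 ETH"))    # same input, same hash
print(short == changed)                          # small change, different hash
print(short)
print(changed)
```

Comparing the last two printed lines shows that the two digests have nothing obvious in common even though the inputs differ by a single character.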

If every blockchain node had to store all data of all transactions ever executed on the blockchain, then the required storage capacity would be excessive. Thanks to hashing, nodes only have to store hashes of transaction data.

2. Merkle Trees

With Merkle Trees, it gets even better. Nodes don’t even have to store the hash of each transaction. They only need to store 1 hash per block that serves as a summary of all the transactions in the block.

This is how it works.

First, the data of each transaction is hashed. These hashes are called “leaves”. Next, the leaves are paired and each pair is hashed together. The resulting hashes are paired and hashed again, and so on, until only 1 hash is left with nothing to pair it with. This final hash is called the Merkle Root.
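The pairing-and-hashing procedure can be sketched as follows. This is a simplified illustration, not the exact scheme of any particular chain: it assumes a single round of SHA-256, and it duplicates the last hash when a level has an odd number of entries, which is one common convention. The names h and merkle_root are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash helper (assumption: single SHA-256)."""
    return hashlib.sha256(data).digest()

def merkle_root(transactions: list[bytes]) -> bytes:
    """Hash each transaction into a leaf, then pair and hash until one hash remains."""
    if not transactions:
        raise ValueError("need at least one transaction")
    level = [h(tx) for tx in transactions]          # the leaves
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])                 # odd count: duplicate the last hash (assumed convention)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

# Eight illustrative transactions, Tx A to Tx H
txs = [b"Tx " + name.encode() for name in "ABCDEFGH"]
root = merkle_root(txs)
print(root.hex())

# Altering any single transaction changes the root
tampered = list(txs)
tampered[3] = b"Tx D (altered)"
print(merkle_root(tampered) == root)
```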

While “full nodes” store the entire transaction history, “light nodes” only store the Merkle Root. (I’m oversimplifying. They also store the previous hash, timestamp, block version, the difficulty target, nonce, etc., but suffice it to say that they store far less data than “full nodes”.)

If any transaction is later added, removed, or altered, the hash of that data will change and the change will propagate up to the Merkle Root. Nodes validating the chain will notice that the blockchain has been tampered with.

If a “light node” wants to verify a particular transaction, it can regenerate the Merkle Root by requesting the relevant information from the “full node”.

Let’s say that a “light node” wants to verify transaction D (Tx D).

The light node will hash transaction D itself to determine H(d). The full node will send H(c), H(ab), and H(efgh), with which the light node will recalculate the Merkle Root. By hashing H(c) sent by the full node and H(d), the light node calculates H(cd). By hashing H(ab) sent by the full node and H(cd) just calculated, the light node calculates H(abcd). Finally, by hashing H(abcd) just calculated and H(efgh) sent by the full node, the light node calculates the Merkle Root. If the recalculated Merkle Root matches the original, it means that Transaction D is indeed included in the block.

The additional values provided by the “full node” are called the “Merkle Proof”.
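The Tx D walkthrough can be reproduced in code. The sketch below assumes SHA-256, eight transactions (a power of two, so no padding is needed), and a proof format of (sibling hash, sibling-is-on-the-left) pairs. All function names are illustrative.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(leaves):
    """Return every level of the tree, leaves first, root last (leaf count must be a power of two)."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels

def merkle_proof(levels, index):
    """What the full node sends: one sibling hash per level, with its side."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1                          # the other member of the pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify(leaf, proof, root):
    """What the light node does: rebuild the root from its own leaf plus the proof."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

txs = [b"Tx " + name.encode() for name in "ABCDEFGH"]
levels = build_levels([h(tx) for tx in txs])
root = levels[-1][0]

proof_d = merkle_proof(levels, 3)                    # Tx D is at index 3
print(len(proof_d))                                  # H(c), H(ab), H(efgh)
print(verify(h(b"Tx D"), proof_d, root))             # genuine transaction
print(verify(h(b"Tx X"), proof_d, root))             # forged transaction
```

The light node needs only three hashes to check one transaction out of eight. In general the proof grows with the height of the tree, not with the number of transactions.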

3. Zero-Knowledge Proofs

The point of zero-knowledge proofs is to prove that you know something without revealing what exactly you know.

A former colleague of mine claimed that she knew the value of pi to 100 decimal places. The easiest way to prove this claim was to recite pi to 100 decimal places. But what if she didn’t want to reveal the value of pi but wanted me to believe that she knew it anyway? She could potentially have done this with zero-knowledge proofs.

Another example would be that if you want to participate in an activity that requires you to be 18 years old, the easiest way to prove that you qualify is to produce an ID like a passport that states your date of birth. However, by showing your passport, you’ll be revealing not only that you’re over 18 but also your exact age, date of birth, nationality, place of birth, etc. This is a privacy concern. If today is 10-Oct-2023, then the person deciding if you can participate in the activity needs to know only whether you were born on or before 10-Oct-2005. Your exact age, sun sign, and country of birth are none of her concern.

Similarly, when it comes to privacy on-chain, authorities have the right to require that your funds came from legitimate sources and that you’re not involved in illegal activities, but you should not be forced to reveal your complete transaction history.

The earlier versions of zero-knowledge proofs were interactive i.e. there was a lot of back-and-forth communication between the prover and verifier to confirm that the prover had the knowledge that they were claiming to have. Two balls and a color-blind friend is a good example to understand how it works.

Interactive ZKPs are not practical though. Tornado Cash uses a more advanced version of ZKPs called zk-SNARKs (Zero-Knowledge Succinct Non-Interactive Arguments of Knowledge):

* Zero-Knowledge: Proves that the prover has the required knowledge without revealing what that knowledge is.
* Succinct: The proof is small, and verifying it takes less time than if the prover just sent the secret knowledge and the verifier had to check whether it generated the right output.
* Non-Interactive: The proof requires only one communication from the prover to the verifier. No back-and-forth communication between the two entities is necessary.
* Argument of Knowledge: The prover can construct a proof that she knows the secret information.

This is how it works:

If there’s a secret code that only you know, you input that secret code (+ a publicly known input + the proving key) into a zero-knowledge algorithm and it generates a Zero-Knowledge Proof.

Note that the proving key is the one input that requires trust. It is generated through a trusted setup ceremony involving multiple participants, where each participant contributes an input and then discards that input. If someone knows all the contributed inputs then they can generate fraudulent proofs. While there is an effort to remove the need for such a ceremony, the situation isn’t as dire as it might sound because one needs to know every single contributed input to create fraudulent proofs. If even one participant discards their input, no fraudulent proofs can be generated.

A verifier can validate your Zero-Knowledge Proof by inputting the publicly known input (+ the verification key) into a different algorithm.

Note that the verifier cannot regenerate the secret code but can only verify that you know the secret code.
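A full zk-SNARK is far too involved for a short snippet, so the sketch below swaps in a much simpler technique: a Schnorr proof of knowledge made non-interactive with the Fiat-Shamir heuristic. It is not what Tornado Cash uses and it needs no proving key, but it shows the same prove-then-verify shape, where one message convinces the verifier that you know a secret without revealing it. The modulus, base, and function names are toy assumptions and are not secure for real use.

```python
import hashlib
import secrets

# Toy public parameters (assumptions for illustration only)
P = 2**127 - 1      # a Mersenne prime; far too small for real-world security
G = 3               # public base

def challenge(y: int, t: int) -> int:
    """Derive the challenge by hashing public values, replacing the verifier's random question."""
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % (P - 1)

def prove(secret: int):
    """Prover: return the public value y = G^secret and a proof of knowing `secret`."""
    y = pow(G, secret, P)
    r = secrets.randbelow(P - 1)         # fresh randomness hides the secret
    t = pow(G, r, P)
    c = challenge(y, t)
    s = (r + c * secret) % (P - 1)
    return y, (t, s)

def verify(y: int, proof) -> bool:
    """Verifier: check G^s == t * y^c (mod P) without ever learning the secret."""
    t, s = proof
    c = challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P

secret_code = secrets.randbelow(P - 1)
public_value, proof = prove(secret_code)
print(verify(public_value, proof))                      # honest proof

t, s = proof
print(verify(public_value, (t, (s + 1) % (P - 1))))     # tampered proof
```

The verifier sees only public_value and the proof, never secret_code. zk-SNARKs generalise this idea from "I know this one exponent" to "I know inputs that satisfy an arbitrary program".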

Tornado Cash

With that background, let's evaluate how Tornado Cash applies Hashing, Merkle Trees, and Zero-Knowledge Proofs to enable on-chain privacy.

With Tornado Cash, the withdrawer has to prove that one of the deposits into Tornado Cash is theirs without revealing which one.

Hashing: Tornado Cash uses two hash functions: Pedersen (H¹), used for commitments and nullifier hashes, and MiMC (H²), used to hash the nodes of the Merkle Tree.

Deposit: When you deposit funds into the Tornado Cash contract, in addition to specifying the asset and value you’re depositing, you also provide a commitment (C) that will be added to the left-most empty leaf of a Merkle Tree of height 20.

The commitment is the Pedersen hash of the concatenation of two random numbers, k (nullifier) and r (secret).

C=H¹(k||r)

C and the address of the depositor are revealed. k and r are hidden.
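The deposit step can be sketched as below. Two loud assumptions: SHA-256 stands in for the Pedersen hash purely so the snippet runs with the standard library, and the 31-byte size of k and r is chosen for illustration. The function names are hypothetical.

```python
import hashlib
import secrets

def H1(data: bytes) -> bytes:
    """Stand-in for the Pedersen hash H¹ (assumption: SHA-256 used only for illustration)."""
    return hashlib.sha256(data).digest()

def new_deposit_note():
    """Generate the private note (k, r) and the public commitment C = H¹(k || r)."""
    k = secrets.token_bytes(31)      # nullifier: kept private until withdrawal
    r = secrets.token_bytes(31)      # secret: never revealed
    commitment = H1(k + r)           # this is the only value sent on-chain
    return k, r, commitment

k, r, C = new_deposit_note()
print("publish on-chain:", C.hex())
print("store safely off-chain:", (k + r).hex())
```

Anyone can see C, but because the hash cannot be reversed, C reveals nothing about k or r.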

Proving knowledge of k and r will be essential to withdraw the funds. Hence, they must be stored safely. If you lose them, you’ll be unable to withdraw the funds, and if someone else lays their hands on these values, they can steal your funds.

Tornado Cash’s trusted setup ceremony to generate the proving key had 1,114 participants, of which 450 revealed their identity and 664 were anonymous. If all participants were anonymous, the protocol would be vulnerable to a Sybil Attack, i.e. one person creating multiple accounts could control the process of generating the proving key. On the other hand, if all participants were known, they would be vulnerable to coercion and blackmail. Recall that to generate fraudulent proofs, one needs every single contribution by every single participant of the ceremony.

Withdrawal: To withdraw, you call the withdraw() function of Tornado Cash and provide:

(i) a newly created withdrawal address,

(ii) a proof that your commitment hash (C) is part of the Merkle Tree with root R. By providing the proof, you show that you know the values of k and r and the position of your C among the leaves, without revealing these values or pinpointing any particular C. Your withdrawal therefore cannot be linked to any particular deposit, and

(iii) the hash of your nullifier, H¹(k). This addresses the next technical challenge: if you don’t reveal which deposit your withdrawal relates to, what stops you from withdrawing multiple times? After your withdrawal, the smart contract saves the nullifier hash. There’s only 1 nullifier for each commitment hash, so once a nullifier hash has been used, it cannot be used again to withdraw funds. Thus a double-spend is avoided.
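The nullifier bookkeeping can be sketched with a toy, in-memory stand-in for the contract. This is not the Tornado Cash contract: the ToyPool class is hypothetical, SHA-256 stands in for the Pedersen hash, and the zk-SNARK check is reduced to a boolean flag so that the sketch can focus on double-spend prevention.

```python
import hashlib

def H1(data: bytes) -> bytes:
    """Stand-in for the Pedersen hash H¹ (assumption: SHA-256)."""
    return hashlib.sha256(data).digest()

class ToyPool:
    """Hypothetical in-memory model of a fixed-denomination mixer."""

    def __init__(self):
        self.commitments = []                 # public list of deposits
        self.spent_nullifier_hashes = set()   # one entry per completed withdrawal

    def deposit(self, commitment: bytes) -> None:
        self.commitments.append(commitment)

    def withdraw(self, nullifier_hash: bytes, proof_is_valid: bool, recipient: str) -> str:
        # proof_is_valid stands in for verifying the zero-knowledge proof on-chain
        if not proof_is_valid:
            raise ValueError("invalid proof")
        if nullifier_hash in self.spent_nullifier_hashes:
            raise ValueError("nullifier already used")
        self.spent_nullifier_hashes.add(nullifier_hash)
        return f"sent 1 ETH to {recipient}"

pool = ToyPool()
k, r = b"nullifier-bytes", b"secret-bytes"    # illustrative values, not random
pool.deposit(H1(k + r))

print(pool.withdraw(H1(k), True, "0xNewAddress"))
try:
    pool.withdraw(H1(k), True, "0xAnotherAddress")
except ValueError as err:
    print("second withdrawal rejected:", err)
```

The contract only ever learns H¹(k), which it cannot connect to C = H¹(k || r), so it can block a repeat withdrawal without learning which deposit was spent.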

Separating Equilibrium

Tornado Cash confers the ability to prove that one of the deposits in a Tornado Cash Pool is yours without specifying which one.

The Separating Equilibrium Solution suggested by the Federal Reserve Bank of St. Louis confers the ability to create a proof to showcase exactly which deposit into Tornado Cash is yours. You can share that with whomever you choose.

Privacy Pools

When using Privacy Pools, you continue to use a Tornado Cash-like contract as normal to break the link between your deposits and withdrawals. Additionally, you provide proof to chosen entities that your funds came from a more restricted “association set”. This restricted association set either (a) contains only deposits you know are clean or (b) excludes all deposits from all known blacklisted addresses.

If “f” is your deposit, you’ll provide two Merkle Proofs:
1. That “f” is part of the complete Merkle Tree with root R
2. That “f” is part of a Merkle Tree of a subset of deposits with root Ra. The deposits in this tree are those that you’re confident are clean or expressly exclude all known tainted deposits.
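The two proofs can be sketched as two ordinary Merkle proofs over different leaf sets. In a real Privacy Pools withdrawal both checks would happen inside a zero-knowledge proof so that “f” itself is never revealed. Here they are done in the clear purely to show the structure. SHA-256, the odd-level padding rule, and the choice of flagged deposits are all assumptions.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def root_and_proof(leaves, index):
    """Return (root, proof) for leaves[index]; proof entries are (sibling_hash, sibling_is_left)."""
    level, proof = list(leaves), []
    while len(level) > 1:
        if len(level) % 2:                    # odd level: duplicate the last hash (assumed convention)
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify(leaf, proof, root):
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

# Eight illustrative deposits, a to h; "f" is ours
deposits = [h(f"deposit {name}".encode()) for name in "abcdefgh"]
f = deposits[5]

# Hypothetical association set: everything except deposits flagged as tainted (c and g)
flagged = {deposits[2], deposits[6]}
association_set = [d for d in deposits if d not in flagged]

R, proof_full = root_and_proof(deposits, deposits.index(f))
Ra, proof_assoc = root_and_proof(association_set, association_set.index(f))

print(verify(f, proof_full, R))               # 1. f is in the complete tree
print(verify(f, proof_assoc, Ra))             # 2. f is in the association set
print(verify(deposits[2], proof_assoc, Ra))   # a flagged deposit cannot reuse f's proof
```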

The user experience will be less than ideal if users have to pick and choose which deposits are included in the subset tree. One would expect this role to be taken on by a new type of intermediary called an Association Set Provider (ASP).

As of now, Tornado Cash supports only fixed-size deposits. For example, you can only deposit 1 ETH into a 1 ETH pool. This provides added protection: if you deposited 3.345678 ETH from one address and withdrew 3.345678 ETH into another address, it would be fairly easy to link the two. Another way in which Privacy Pools are likely to be superior to Tornado Cash is that they aim to support arbitrary-size deposits and withdrawals. For example, you could deposit 15 ETH and withdraw 1, 2, 3, 4, and 5 ETH into 5 different addresses.

Conclusion

Solutions like Privacy Pools showcase that crypto is no longer being led by anarchist cowboys and that there’s a genuine attempt to wipe out malicious practices while retaining the core ideals.


Written by Tiena Sekharan