How to make prediction markets anonymous?

8 min read · Jul 2, 2023

“I will say that the plan of the young Mouse is very good. But let me ask one question: Who will bell the Cat?” — Aesop

Recapping the rationale.

Reputation systems are essential to the smooth operation of modern social systems and structures as they are increasingly becoming impersonal. Reputation systems are essential to putting the Web3 industry on a healthy footing, restoring the public trust, and developing towards a peaceful coexistence while retaining one’s sovereignty.

A particular group that badly needs a reputation standard is crypto content creators, the middlemen of the industry, responsible for on-boarding and educating ordinary users. The absence of a quality curation mechanism in the crypto content creator space opens the door to scams and manipulation, alienates the public, and impedes the industry’s drive towards mass adoption.

On-chain prediction markets are a potent tool that can be used to assess individual competency. However, their major limitation in this regard is excessive transparency, leaving them vulnerable to exploitation by dishonest or malicious actors. Classic on-chain prediction markets like Polymarket and Augur enable users to see each other’s predictions at will and copy the best predictors. As a result, the value of each prediction gets diluted and the data we garner about individual predictors, crypto content creators in our case, are extremely noisy.

Anonymity makes prediction replication mathematically infeasible and thus prevents the devaluation of predictions and makes the data less noisy.

What kind of anonymity.

So far, my use of “anonymity” has been rather handwavy. This section and the next define the term more precisely and lay the groundwork for the subsequent discussion of methods to attain anonymity.

In the context of prediction markets, there are essentially two kinds of anonymity:

Agent (address) anonymity.
Action anonymity (commonly known as privacy).

A large volume of research has been produced on implementing both in the context of blockchains. In some cases, such as DAO voting, both are needed. In voting, the secret ballot plays a crucial role in preserving the independence and integrity of the voting process, making it impossible to pressure the voter into selecting an outcome against their will. Privacy, meanwhile, is important because anything on-chain is publicly visible, which makes it easy for people to bandwagon rather than make decisions about the case themselves.

In DeFi applications, there is similarly a case to be made for both anonymity and privacy due to the high sensitivity of financial markets and numerous malicious actors or, at the very least, actors with unknown motivations. This is why privacy applications and chains, such as Railgun, Aztec, and Tornado Cash, enable both kinds of anonymity.

Conversely, in prediction markets, agent anonymity is arguably less important, as long as we prevent people from blindly copying the concealed prediction (i.e., the hash) of an expert account.

Permanence and temporariness offer us an additional angle for analyzing anonymity. DAOs and especially financial markets make a strong case for permanent anonymity and/or privacy due to their sensitivity and the high cost of information leakage, even ex-post.

In contrast, in reputation systems based on prediction markets, we only really care about anonymity and/or privacy during the vote; once the correct outcome has been identified, we actually want the track record of who predicted what to be publicly available.

Desirable characteristics of anonymity and/or privacy in prediction markets.

Replication resistance: it is infeasible to copy someone else’s prediction before the prediction market is resolved. Replication resistance preserves the value of individual predictions and thus allows us to identify competence.
Sybil resistance: it is infeasible for one address to submit multiple predictions in the same market. Sybil resistance ensures that users cannot game the prediction market by hedging their bets, so to speak. Hedging one’s bets is a particularly tempting strategy in Pythia, where users do not put money on the line and receive Reputation Tokens for correct predictions.
Usability: the anonymized prediction markets platform should not impose significant overhead costs on the participants compared to non-anonymized prediction markets platforms. Reputation systems only make sense if they are used by a large number of people. Usability is therefore key.

ZKP approach.

The much hyped ZKPs perhaps first come to mind when thinking of making anything private or anonymous. zk-STARKs and zk-SNARKs are the two most widely used implementations of zero knowledge tech. But are they the best solution for prediction market-based reputation systems? Let’s see.

ZKPs are a cryptographic primitive that allows a prover to convince anyone that a statement is true without revealing any information about the statement other than its truthfulness. The chart below demonstrates a hypothetical implementation of a zk-SNARK.

zk-SNARKs are cheaper than zk-STARKs and are currently more popular within the ecosystem; they are used by projects like Aztec, zkSync, and Scroll.

A zk-SNARK consists of three algorithms: G, P, and V. G generates a pair of keys, pk and vk. P takes pk, the data that the individual wants to prove something about, and some public data, and outputs a proof. V takes the proof, the public data, and vk, and verifies the proof. A key practical property of SNARKs is that verification is much faster than proving.
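In symbols, the three algorithms can be summarized roughly as follows. This is only a notational sketch: the labels λ (security parameter), C (the circuit encoding the statement), x (public data), w (private data), and π (proof) are assumptions introduced here for illustration.

```latex
\begin{aligned}
(pk, vk) &\leftarrow G(\lambda, C) \\
\pi &\leftarrow P(pk, x, w) \\
V(vk, x, \pi) &\in \{\text{accept}, \text{reject}\}
\end{aligned}
```

The private data w appears only as an input to P; the verifier V sees just vk, x, and π.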

In the context of prediction markets, we can use zk-SNARKs the following way:

A crypto content creator sends the prediction along with the hash of some private data (e.g., private key or signature) to a relayer.
The relayer relays the hash and the prediction on-chain; the information gets recorded in a smart contract.
After the market has been resolved, the content creator, to receive their reward, proves via a SNARK that their address, previously unknown to the public, has made that prediction without revealing the private key.
The protocol (Pythia) verifies the correctness of the proof and grants the reward to the user.

Advantages of the scheme:

Replication resistance — it is infeasible to know who made the prediction (as long as the hash is computationally infeasible to reverse).

Disadvantages of the scheme:

Lack of sybil resistance: there is no requirement on what private information is being used. Unless we somehow force the content creator to use their private key or signature, they can send multiple hashes, each with different private information, and successfully hedge their bets. Sybil resistance would require forcing the content creator to use specific private information with the help of commitment schemes. However, this would introduce additional complications to the scheme.
Low usability: SNARKs, and especially STARKs, are very expensive to verify. SNARK verification costs upwards of 300k gas.
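The sybil problem can be illustrated with a minimal sketch. It assumes SHA-256 as the hash and random byte strings as the "private data"; both are illustrative choices, and the names `blind_id` and `on_chain` are hypothetical rather than part of the protocol.

```python
import hashlib
import secrets


def blind_id(private_data: bytes) -> str:
    """Hash that accompanies a prediction sent through the relayer."""
    return hashlib.sha256(private_data).hexdigest()


# One creator invents a fresh secret for every outcome they want to cover.
secret_yes = secrets.token_bytes(32)
secret_no = secrets.token_bytes(32)

# What the smart contract ends up recording: hash -> prediction.
on_chain = {
    blind_id(secret_yes): "YES",
    blind_id(secret_no): "NO",
}

# The contract only sees two unrelated hashes, so a "one prediction per
# hash" rule is satisfied even though one person covered both outcomes.
distinct_entries = len(on_chain)
```

Nothing on-chain links the two entries to the same creator, which is exactly what lets them hedge.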

To sum up, ZKPs are a powerful tool but they have many disadvantages when it comes to implementing anonymity in prediction markets. In the default case, they do not protect against sybil attacks and are highly costly.

Commitment scheme approach.

Commitment schemes are a cryptographic primitive allowing individuals to commit to a certain secret value with the ability to reveal it later. Commitment schemes offer a weaker form of privacy and anonymity since, for liveness purposes, the user has to reveal the secret.

As mentioned though, this weaker form of privacy and/or anonymity is actually fine in the case of prediction markets-based reputation systems where data about the participants following market resolution should be publicly available and contribute towards a track record.

The key advantage of commitment schemes over ZKPs is that they are orders of magnitude cheaper with a verification overhead of ~3k gas.
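Taking the two ballpark figures quoted in this article at face value (roughly 300k gas for SNARK verification and roughly 3k gas for the commitment check), the ratio works out as:

```latex
\frac{300{,}000\ \text{gas}}{3{,}000\ \text{gas}} = 100 = 10^{2}
```

That is about two orders of magnitude per verified prediction, before accounting for any other costs of either scheme.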

The simplest yet flawed way of implementing commitment schemes in our case is as follows:

A content creator sends the hash of their prediction on-chain.
After resolution the content creator reveals their prediction.
The protocol checks that the hash of the revealed prediction matches the original hash, and sends the content creator their reward.

This approach is not replication resistant, because the small, discrete set of market outcomes and knowledge of the hash allow other content creators to determine what a user has predicted. It is, however, sybil resistant, because we store the fact of the prediction by an address in the smart contract.

The way to make it replication resistant is to require the user to sign a message and only then hash it and send the hash to the smart contract. The content creator will sign with their private key, which is unique. Since nobody else can produce that signature, the hash can no longer be recovered by trying every possible outcome. This is how the new scheme could look in practice:

The signed message in this case is the prediction and the address of the market where the prediction was made by the crypto content creator. During verification, the content creator will provide the signature along with the prediction and the address of the market. The smart contract, given the signature and the inputs, can recover the address of the signer and check that it matches the address that submitted the original hash.
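The signed commitment flow can be sketched end to end in plain Python. This is an illustration under stated assumptions, not Pythia's implementation: SHA-256 stands in for keccak256, public keys stand in for addresses, the nonce derivation is simplified, and the `Market` class with its `commit` and `reveal` methods is a hypothetical stand-in for the smart contract.

```python
import hashlib

# secp256k1 domain parameters (the curve behind Ethereum signatures).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two affine curve points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Scalar multiplication by double-and-add."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def h(data):
    """SHA-256 as an integer (assumed stand-in for keccak256)."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")

def sign(priv, z):
    """ECDSA signature (r, s, v); v is the parity bit needed for recovery."""
    # Simplified deterministic nonce for this sketch; real wallets use RFC 6979.
    k = h(priv.to_bytes(32, "big") + z.to_bytes(32, "big")) % N
    R = mul(k, G)
    r = R[0] % N
    s = pow(k, -1, N) * (z + r * priv) % N
    return r, s, R[1] & 1

def recover(z, sig):
    """Recover the signer's public key from a signature and message hash."""
    r, s, v = sig
    y = pow((pow(r, 3, P) + 7) % P, (P + 1) // 4, P)  # sqrt, valid as P % 4 == 3
    if y & 1 != v:
        y = P - y
    return mul(pow(r, -1, N), add(mul(s, (r, y)), mul(-z % N, G)))

def sig_hash(sig):
    """The commitment that goes on-chain: a hash of the signature."""
    r, s, v = sig
    return h(r.to_bytes(32, "big") + s.to_bytes(32, "big") + bytes([v]))

class Market:
    """Toy stand-in for the market contract; `sender` plays the caller's address."""
    def __init__(self, market_id):
        self.market_id = market_id
        self.commitments = {}

    def message(self, prediction):
        return h(f"{self.market_id}:{prediction}".encode())

    def commit(self, sender, commitment):
        if sender in self.commitments:  # one prediction per address
            raise ValueError("address already predicted")
        self.commitments[sender] = commitment

    def reveal(self, sender, prediction, sig):
        if self.commitments.get(sender) != sig_hash(sig):
            return False
        return recover(self.message(prediction), sig) == sender

market = Market("market-1")
alice_key, bob_key = 0xA11CE, 0xB0B              # toy private keys
alice, bob = mul(alice_key, G), mul(bob_key, G)  # public keys act as addresses

sig = sign(alice_key, market.message("YES"))
market.commit(alice, sig_hash(sig))
market.commit(bob, sig_hash(sig))                # Bob blindly copies Alice's hash

alice_ok = market.reveal(alice, "YES", sig)      # genuine reveal
bob_ok = market.reveal(bob, "YES", sig)          # copied reveal
```

Under these assumptions `alice_ok` comes out true and `bob_ok` false: Bob can copy the hash, and after Alice reveals he can even copy her signature, but the recovered signer is Alice's key rather than his. A second `commit` from the same address is rejected, which is where the sybil resistance comes from.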

Advantages of new commitment scheme:

Replication resistance — it is infeasible to learn anything about the prediction from the hash.
Sybil resistance — a content creator address can only submit one prediction on-chain because we store the address of the account that made the prediction in a smart contract.
Usability — gas overhead of ~3k gas only.

Disadvantages of new commitment scheme:

No major disadvantages; lack of anonymity ex-post compared to the zk approach is actually beneficial in our case enabling anyone to see the crypto content creator’s track record of predictions.

Conclusions.

Prediction markets have the potential to become a powerful reputation tool due to their very natural way of measuring competency based on the accuracy of forward-looking statements.
Some form of anonymity is essential to prediction market-based reputation systems to remove the noise from the data about individual participants.
There are two approaches to anonymity in prediction market-based reputation systems: make the predictors anonymous or make the predictions private.
The two main anonymity and privacy techniques on-chain are ZKPs and commitment schemes.
ZKPs offer greater anonymity and privacy guarantees but are expensive and not sybil resistant.
Commitment schemes are much cheaper, provide just the necessary privacy and/or anonymity, and are sybil resistant.

If you enjoyed this article, let me know on Twitter @pneumatic_orcl or email at nikita.kravchenko@pythia.company; also follow us on Medium to learn about our company, product, and upcoming release.

Written by Pythia
