The Privacy Vs. Accountability Problem: Solving for Sybil Attacks


#1

As the opening thread in this new Sovrin Trust Framework Working Group category here on the Sovrin Forum, this is a particularly meaty one.

Sybil attacks are so well known in distributed, peer-to-peer computing that they have their own Wikipedia page. Essentially the problem is: how do you protect a distributed network from attack if it is open to anyone to join?

All public DLTs (distributed ledger technologies) have to deal with this challenge—Bitcoin’s proof-of-work (PoW) solution has become the best known solution and has generated an entire industry of Bitcoin miners (and huge energy waste).

However, Sovrin has a special challenge in that it must prevent Sybil attacks while at the same time giving individuals (and all identity owners) the ability to be truly self-sovereign, including the ability to control the privacy of their identifiers.

The ultimate form of this control would be to enable identity owners to be completely anonymous. But perfect anonymity = zero accountability.

That’s why this is potentially the most challenging problem for the Sovrin Trust Framework (at the policy level) and the Sovrin ledger (at the code level). I have prepared a writeup of the problem and four different potential solutions in a Google doc entitled The Privacy Vs. Accountability Problem: Solving for Sybil Attacks. It will be the topic of our Sovrin Trust Framework Working Group webmeeting today. Please do read through it and let’s use this thread to discuss it.


#2

Perfect anonymity = zero accountability is true, but in this context it is also something I find somewhat confusing. The goal of an identity network is to establish identity, and the goal of a public, global identity network is to create a means of safely sharing identity information with arbitrary consumers.

In order for an arbitrary consumer of a trust claim to have trust in that claim, it must know something about the identity backing it. A set of completely anonymous backers of an identity claim, which itself reveals no information about the identity owner, cannot provide information that is globally fungible.

The infrastructure for identity management, in that case, provides a subnet where only participants within that network, through the mechanism of external agreements, can identify members within their trust-cabal and trust each other. While it may be useful to those individuals, the fact that it is disjoint from the global community suggests that they consume the community resources without participating in the global trust framework.

Let’s say Eric and Eddie have a trust cabal - let’s say DID:eric and DID:eddie exist in the system. Let’s say DID:eric wants to claim he has USD$1M for an investment, and this is backed by DID:eddie. This is only useful to a non-cabal member, say DID:sally, if DID:sally also trusts DID:eddie. Perhaps DID:eddie is very trustworthy, and that DID:eddie has a well established reputation of never deceiving. This is a case where there is cross-talk between cabal and non-cabal members, but it hinges on the integrity of an anonymous DID.

To me - whether that DID is minted by some “Zero Knowledge Minter” or via some direct ledger interaction via PoW or PoS seems insignificant. A Zero Knowledge Minter says “here is a DID and I know nothing about it - I handed it out to some random requestor w/o collecting identity information” - a PoW or PoS minted DID says “here is a DID that no one knows anything about”.

Since the trust rests on the reputation of DID:eddie, then whether it comes from a ‘trust anchor guaranteeing anonymity’ or from a direct protocol level action, seems moot.

Also - for pure-anonymous DIDs, the issue seems to me to mostly be one of “spamming the ledger” - and for that, some sort of token/control system needs to be in place to throttle submissions. PoW and PoS achieve that throttling, but at some cost to the performance of the network. Combining PoW and PoS with the ability to mint ‘blocks of ids’ that can then be handed out anonymously strikes me as a good balance.
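To make the throttling idea concrete, here is a minimal sketch of a hashcash-style proof of work that gates a whole block of DIDs rather than each DID individually. It is not Sovrin or Plenum code: the names `solve`, `verify` and `mint_block`, the SHA-256 puzzle, and the difficulty and block-size values are all assumptions made for illustration.

```python
import hashlib
from itertools import count


def _digest_value(payload: bytes, nonce: int) -> int:
    """SHA-256 of payload || nonce, read as a big-endian integer."""
    digest = hashlib.sha256(payload + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def solve(payload: bytes, difficulty_bits: int) -> int:
    """Search for a nonce whose digest has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(payload, nonce) < target:
            return nonce


def verify(payload: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Checking a proof costs one hash, however long it took to find."""
    return _digest_value(payload, nonce) < (1 << (256 - difficulty_bits))


def mint_block(payload: bytes, nonce: int, difficulty_bits: int, block_size: int) -> int:
    """Return how many DIDs this proof authorizes (hypothetical policy):
    one valid proof unlocks `block_size` DIDs, an invalid one unlocks none."""
    return block_size if verify(payload, nonce, difficulty_bits) else 0


if __name__ == "__main__":
    request = b"anonymous-minter-request-001"  # hypothetical request payload
    nonce = solve(request, 16)                 # roughly 65,000 hashes on average
    print(mint_block(request, nonce, 16, block_size=100))
```

The cost is paid once per block, so the per-DID overhead shrinks as the block grows; picking the difficulty and block size is then the actual policy question.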

I am interested in thoughts about this construct and my straw-man argument - keeping in mind that the TFWG meets at 4:00 a.m. local time! :wink:


#3

Eric, thanks for your well-reasoned reply. I agree on several key points:

  1. The vast majority—nearly 100%—of trust in Sovrin identities (DIDs) will come via verifiable claims and relationships, and not on who issued the DID.
  2. Those trust relationships will all be, as you describe, organic and dependent on their own particular trust circles/networks (which in some cases will be represented by formal trust frameworks).
  3. So what the Sovrin Trust Framework needs to provide protection for is the very base layer of the whole network, which is the issuance of new DIDs.

On that last point, I also agree that any combination of the four methods we discuss in the Google doc can work to provide a sufficient throttle. That was also the consensus of the folks on today’s WG call (which I know is at a bad time; there is no one time that works for everyone). And that was also the input of Phil Windley in the Sovrin Trust Framework Slack channel today.

I am very interested in others’ thoughts about this key issue.


#4

I concur that the focus should be on protection of the network but I think maintaining anonymity is also critical if desired or required by any participating party.

Unless I’m mistaken DIDs are not something that are likely to be created very frequently (at least nowhere near the frequency with which other entry types are likely to be recorded) - they would only be created when bootstrapping a new identity shard. On that basis I think trust models 2, 3 or 4 (or a combination thereof) should be reasonable.

If all records are associated with an existing DID then it is possible to determine whether an ‘account’ is spamming the network with other types of record. Once detected, it can be cut off. If the DID is anonymous, though, identifying the actual attacker remains impossible, so nothing prevents them from setting up new DIDs to spam from other than the throttling of the creation of those ‘accounts’.
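As a rough illustration of "detect, then cut off", here is a sliding-window write counter keyed by DID. Everything in it is an assumption: the class name `WriteThrottle`, the limits, and the policy of blocking a DID permanently on its first violation. It says nothing about how Plenum actually handles spam.

```python
from collections import defaultdict, deque


class WriteThrottle:
    """Allow at most `max_writes` per DID within `window_seconds`;
    a DID that exceeds the limit is cut off for good (hypothetical policy)."""

    def __init__(self, max_writes: int, window_seconds: float):
        self.max_writes = max_writes
        self.window = window_seconds
        self.history = defaultdict(deque)  # DID -> timestamps of recent writes
        self.blocked = set()

    def allow(self, did: str, now: float) -> bool:
        if did in self.blocked:
            return False
        times = self.history[did]
        # Forget writes that have aged out of the window.
        while times and now - times[0] >= self.window:
            times.popleft()
        if len(times) >= self.max_writes:
            self.blocked.add(did)
            return False
        times.append(now)
        return True


if __name__ == "__main__":
    throttle = WriteThrottle(max_writes=2, window_seconds=10.0)
    for t in (0.0, 1.0, 2.0):
        print(t, throttle.allow("did:example:spammer", t))
```

Note that the block only follows the DID, not the person behind it, which is exactly why the cost of creating a fresh DID still matters.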

I asked a question earlier on the technical channel about prevention of spamming in general (http://forum.sovrin.org/t/how-does-plenum-discourage-spam/81/2) - should this be considered related?


#5

I think that there are two issues w/ spamming:

  1. spamming via DID creation
  2. spamming from a DID via spurious claims

Anonymity, i think, is an issue for (1) above - the creation of DIDs, but not for (2). For (2) the identity that matters is the DID, and, as you point out, misbehaving DIDs can be cordoned off… and such sanctions should be spelled out in the TF legal documents. If I’m not mistaken, for ‘claims related spamming’, the technical solution would rely upon the enforcement of the write-permissions for a given DID. Is a DID constrained to writing ‘through’ a specific trust anchor? How would this be implemented?

But the relationship between (1) and (2) definitely raises some legal concerns (lawyer-folk want to chime in?) - I assume there is an expectation that a trust-anchor ‘speaks for’ the behaviour of the DIDs minted on its behalf? If that is the case, then creating a special legal condition for ‘anonymous DIDs’ is paramount.

This also raises the question of ‘anonymous DIDs’ - would someone who wishes to generate anonymous DIDs at a protocol level be required to sign legal documents establishing their role as ‘trust anchor’ - and thus given the right to write new IDs? Is there a KYC-like process around that - if so, then anonymity is completely broken unless a trust-anchor can mint DIDs to which it has no legal obligation?

If the latter is the case, then taxing “trust-anchor vetted DIDs” at a rate of <= 1 anonymous DID per vetted DID, would be a sort of solution - but it does spell out a “special category” for DIDs - namely those who are policed entirely at the protocol level.
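A rough sketch of that ratio rule, to show how little bookkeeping it needs. The class `AnchorQuota` and its method names are hypothetical; the only thing taken from the text above is the cap of at most one anonymous DID per vetted DID.

```python
class AnchorQuota:
    """Per-trust-anchor counter: anonymous DIDs may never exceed
    `anonymous_per_vetted` times the number of vetted DIDs."""

    def __init__(self, anonymous_per_vetted: float = 1.0):
        if not 0 <= anonymous_per_vetted <= 1:
            raise ValueError("rate must be between 0 and 1 anonymous DIDs per vetted DID")
        self.rate = anonymous_per_vetted
        self.vetted = 0
        self.anonymous = 0

    def record_vetted(self) -> None:
        """Called whenever the anchor mints a DID it has vetted."""
        self.vetted += 1

    def try_mint_anonymous(self) -> bool:
        """Mint one anonymous DID if the ratio still holds afterwards."""
        if self.anonymous + 1 <= self.vetted * self.rate:
            self.anonymous += 1
            return True
        return False


if __name__ == "__main__":
    quota = AnchorQuota(anonymous_per_vetted=0.5)
    quota.record_vetted()
    quota.record_vetted()
    print(quota.try_mint_anonymous())  # allowed: 1 anonymous for 2 vetted
    print(quota.try_mint_anonymous())  # refused: would exceed the ratio
```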

Unfortunately, policing edge cases at the protocol level leads to a sub-optimal design for nominal operations - my concern is that the technical developments will favor obtuse edge cases on the grounds of esoteric arguments of principle. The principle should not be compromised, but “expected edge cases”, like anonymous trust anchors generating anonymous identities that refuse to share in the global identity exchange, should be treated as exactly that: edge cases.

Also - from our customer inquiries - the need to “clean up an identity space” before publishing to a global environment is quite evident. This leads to the need for “temporary DIDs” - DIDs that get created, then terminated as they are merged into a more sensible identity graph.

Blocks of DIDs that have this sort of “temporary, unclaimed, status” fit well with “anonymous IDs” - i would not want to see a high PoW or PoS cost associated with a DID I expect to be “subsumed by another DID” once the owner is engaged.

Perhaps a DDO component indicating “class of DID” is worth exploring - where class of DID would be something like “permanent”, “transient, ending on:”, “pending trust-anchor claim, valid for:” - where the latter would support completely anonymous DIDs, which could be produced free of T.A. legal framework but which would be poised to either “be claimed by someone participating in the trust framework” or “be abandoned” (subject to cache disclusion).
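One way such a “class of DID” component might look as a data structure is sketched below. None of these field or class names come from the DDO format; `DidClass`, `DidRecord`, `pending` and the claim rule are all assumptions used to show the three states and the transition from “pending” to “permanent”.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class DidClass(Enum):
    PERMANENT = "permanent"
    TRANSIENT = "transient"              # "transient, ending on:"
    PENDING_CLAIM = "pending-claim"      # "pending trust-anchor claim, valid for:"


@dataclass
class DidRecord:
    did: str
    did_class: DidClass
    expires_at: Optional[datetime] = None   # unused for permanent DIDs
    claimed_by: Optional[str] = None        # trust anchor that claimed it, if any

    def is_active(self, now: datetime) -> bool:
        if self.did_class is DidClass.PERMANENT:
            return True
        return self.expires_at is not None and now < self.expires_at

    def claim(self, anchor_did: str, now: datetime) -> bool:
        """A pending DID becomes permanent if claimed before it lapses."""
        if self.did_class is not DidClass.PENDING_CLAIM or not self.is_active(now):
            return False
        self.did_class = DidClass.PERMANENT
        self.claimed_by = anchor_did
        self.expires_at = None
        return True


def pending(did: str, now: datetime, valid_for: timedelta) -> DidRecord:
    """Create an anonymous, unclaimed DID that lapses after `valid_for`."""
    return DidRecord(did, DidClass.PENDING_CLAIM, expires_at=now + valid_for)
```

A DID that is never claimed simply stops being active once `expires_at` passes, which matches the “be abandoned” outcome without any extra protocol step.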

i’d love to hear more thoughts on this - and thank you srottem for your input!


#6

This is a very interesting conversation. I’m adding some comments to the Google doc that Drummond created.

We could assume that DIDs are only created when a new claim is published from an issuer to an identity owner. Therefore the issuer most likely knows something about the identity owner in order to put something meaningful into the claim (e.g. name & address & DoB). Ergo the issuer already knows who the identity owner is so there’s no problem about anonymity.

Of course the issuer won’t know any of the other DIDs belonging to the identity owner that have been created as a result of other claims written by other issuers, so non-correlation remains intact.

Perhaps the question is “do we allow creation of a DID without an associated claim or pairwise relationship”?

Another thought, to resurrect very early discussions on the Sovrin economic model: we talked back in February about those who write to Sovrin obtaining “credits” to read from Sovrin. We parked that approach as we wanted reading to be a no-cost facility like DNS. Could some sort of equivalent be created based on trust level and creation of claims (as we know that claim issuers will drive the uptake of Sovrin)? So you could have a formula to determine the number of DIDs you can create based on your trust level and the number of claims you’ve issued to existing DIDs.

This is a bit like a proof of stake model, and it rewards those who contribute most to the Sovrin ecosystem. An interesting secondary market in “DID creation credits” might emerge.
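Purely as a strawman, such a formula could be as simple as the sketch below. The function `did_allowance`, the base of 5 DIDs per trust level and the rate of one credit per 10 claims are invented numbers, there only to show the shape of the calculation.

```python
def did_allowance(trust_level: int, claims_issued: int, dids_created: int,
                  base_per_level: int = 5, claims_per_credit: int = 10) -> int:
    """Remaining DIDs an issuer may create (hypothetical formula):
    a base grant that scales with trust level, plus one credit for every
    `claims_per_credit` claims issued, minus DIDs already created."""
    base = trust_level * base_per_level
    earned = claims_issued // claims_per_credit
    return max(0, base + earned - dids_created)


if __name__ == "__main__":
    # Trust level 2, 35 claims issued, 4 DIDs already created:
    # 2 * 5 + 35 // 10 - 4 = 9 DIDs remaining.
    print(did_allowance(trust_level=2, claims_issued=35, dids_created=4))
```

Credits earned this way could be made transferable, which is where a secondary market would come in.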