DID documents and URIs

Hi,

The documentation mentions that a DID points to a DID document. According to the glossary, a DID document contains “Public Keys, Service Endpoints, and other metadata associated with a DID”. On GitHub, an example of a DID document is given:

{
  "id": "did:sov:mnjkl98uipsndg2hdjdjuf7",
  "publicKey": [{
      "id": "key1",
      "type": "ED25519SignatureVerification",
      "publicKeyBase58": "...",
      "authorizations": ["all"]
    }],
  "authentication": [{
      "type": "ED25519SigningAuthentication",
      "publicKey": "key1"
   }],
  "service": [{
      "type": "agentService",
      "serviceEndpoint":"https://www.sovrin.org/agents"
   }]
}

I have a question about the serviceEndpoint entry in the snippet above. Is it mandatory to include a URI in a DID document?

The reason I’m asking is that I’m concerned about linkability. If a user, Alice, uses the same edge agent in all of her relationships, and uses distinct pairwise pseudonymous DIDs (i.e. anonyms) in multiple attribute disclosure sessions with the same verifier (let’s call him Bob), then Bob would be able to correlate these sessions to one another by looking at the URI in the DID documents that the individual DIDs (i.e. anonyms) refer to. After all, if the URI refers to Alice’s edge agent, then presumably the URI is the same in each of these DID documents?

Of course, this (potential) issue only occurs when edge agents are being used. If Alice had used a cloud agent, correlation would have been more difficult, as multiple users would share the same endpoint URI. Or wouldn’t they?

As Sovrin is a well thought-out ecosystem (albeit still in development), I’m probably missing something. I’m looking forward to hearing where my thinking goes wrong :slight_smile:

Thank you in advance!

Best regards,
Ruben

Is it mandatory to include a URI in a DID document?

No.

Bob would be able to correlate

Notice that the example URI, https://www.sovrin.org/agents, is NOT specific to Alice; it contains no DID ID, no name for Alice, no other identifier that binds the URI to her. In fact, we would expect thousands or even millions of DID Docs to have that same URI, including many from Alice, but also many from numerous other people.

This creates what is called “herd privacy”, because Alice’s DID is just one of many DIDs that use an identical URI.
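To make the herd-privacy point concrete, here is a minimal sketch of what a verifier like Bob could actually learn by grouping the DID Docs he has seen by their service endpoint. The DIDs, the second endpoint URL, and the function name are made up for illustration; only the shared URI comes from the example above.

```python
from collections import defaultdict

def anonymity_sets(did_docs):
    """Group DIDs by the service endpoint their DID Doc advertises."""
    groups = defaultdict(set)
    for doc in did_docs:
        for service in doc.get("service", []):
            groups[service["serviceEndpoint"]].add(doc["id"])
    return dict(groups)

# Hypothetical DID Docs, reduced to the fields that matter here.
SHARED = "https://www.sovrin.org/agents"
docs = [
    {"id": "did:example:alice-1", "service": [{"serviceEndpoint": SHARED}]},
    {"id": "did:example:alice-2", "service": [{"serviceEndpoint": SHARED}]},
    {"id": "did:example:carol-1", "service": [{"serviceEndpoint": SHARED}]},
    # An endpoint used by a single DID offers no herd to hide in.
    {"id": "did:example:dave-1",
     "service": [{"serviceEndpoint": "https://dave.example/agent"}]},
]

for endpoint, dids in anonymity_sets(docs).items():
    print(endpoint, "is shared by", len(dids), "DID(s)")
```

The larger the group behind one endpoint, the less the endpoint alone says about who owns a given DID; an endpoint with a single DID behind it is effectively an identifier.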

In DID Communication, this type of endpoint acts as what is known as a “routing agent”. For companies or IoT devices that don’t care about privacy, it can just be their agent listening. For private individuals, however, it need not be a URI operated personally by Alice; rather, it can be an organization that has agreed to accept messages for many agents, and to route them on to the specific agent that represents a given DID, based on metadata on the outside of a message’s encrypted envelope. You can think of it like a mail clerk at a post office, and you can think of the service endpoint as a postal code that covers a large geographical area. Effectively, this service endpoint is like Alice saying to whoever sees the doc, “You can get mail to me if you deliver it to the New York City post office.” Since lots and lots of people will be saying the same thing, Alice remains anonymous enough.

Now, you might say, “If the URI isn’t specific to Alice, how does the mail clerk know enough to get it to her?” The answer is that Alice leaves instructions with the mail clerk, saying “Hey, if you get any mail addressed to DID xyz, please forward it to me at the following location.” This is not a public statement; it’s a private statement from Alice to the clerk. Thus, the public sends to Alice at a generic location, but the clerk turns it into a specific destination and Alice receives it privately.
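As a rough illustration of the mail-clerk idea, the sketch below models a routing agent that keeps a private DID-to-location table and forwards opaque payloads. Everything here (the class name, the `to`/`msg` envelope fields, the locations) is an assumption made for the example, not the actual Sovrin or DIDComm message format.

```python
class RoutingAgent:
    """Toy mail clerk: maps a destination DID to a private location."""

    def __init__(self):
        # Private forwarding instructions; never published in a DID Doc.
        self._routes = {}

    def register(self, did, location):
        """Alice privately tells the clerk where mail for `did` should go."""
        self._routes[did] = location

    def route(self, envelope):
        """Look at the outside of the envelope only; never open `msg`."""
        location = self._routes.get(envelope["to"])
        if location is None:
            raise KeyError("no forwarding instructions for " + envelope["to"])
        return location, envelope["msg"]

clerk = RoutingAgent()
# Alice uses a different pairwise DID per relationship, same device behind them.
clerk.register("did:example:alice-for-bob", "device://alice-phone")
clerk.register("did:example:alice-for-shop", "device://alice-phone")

# The sender only knows the generic endpoint and the destination DID.
envelope = {"to": "did:example:alice-for-bob", "msg": b"<encrypted bytes>"}
location, payload = clerk.route(envelope)
print(location)
```

Note that only the clerk can see that both DIDs lead to the same device; a sender like Bob sees nothing beyond the shared public endpoint.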

The next logical question might be: “But doesn’t that mean the clerk can destroy Alice’s privacy?” The answer is “Possibly. Alice certainly shouldn’t use a clerk that she considers malicious. But Alice doesn’t have to use the same clerk for all her relationships, or even the same clerk for a single relationship (she can have multiple endpoints, and rotate them as needed). Plus the clerk doesn’t know what is inside the envelopes Alice receives, OR the sender of the messages. It only sees the destination. Importantly, the clerk is not used for outgoing transmissions, only for incoming.” Still, having a properly behaved mail clerk (routing agent) is important, and this is why Sovrin has begun describing minimum behavior standards for what is called an “agency” (a provider of routing services).

Hi Daniel,

Thank you for your elaborate reply! It really gives good insight into how Sovrin agents work.

Reading your explanation, I feel that I initially misunderstood what a cloud agent typically entails. So, just checking whether I understand it correctly now: Is routing messages the primary use case for cloud agents? Or would you rather say that a routing agent is a specific type of cloud agent, specialized in forwarding messages?

Importantly, the clerk is not used for outgoing transmissions, only for incoming.

Alice sends her messages to the receiver’s routing agent, which in turn will forward them to the receiver’s edge agent, right? This makes an interesting case when Alice and the receiver use the same routing agent, because the agent then is at the center of the relationship between Alice and the receiving party. However, even then, I suppose that the knowledge of the routing agent is rather limited? It only knows the network location of the agents that represent the DIDs of Alice and the receiver, and the IP addresses of these agents (unless Alice/receiver use a VPN or the Tor network, for instance). Nevertheless, as Alice (and possibly the receiver) use pairwise pseudonymous DIDs, the routing agent cannot correlate individual relationships with one another. Furthermore, because the contents of the messages that are being forwarded are encrypted, in terms of confidentiality the routing agent probably poses little threat also?

Routing messages is the primary use-case for cloud agents?

The vast majority of the routing agents will probably be cloud agents, but not all cloud agents will do routing. Institutions and IoT devices may simply have cloud agents that directly receive messages, with no intermediate routing involved.

Alice sends her messages to the receiver’s routing agent, which in turn will forward them to the receiver’s edge agent, right?

Right.

even then, I suppose that the knowledge of the routing agent is rather limited?

Yes. As you point out, the sender can use Tor to obfuscate. Also, the ~timing decorator on a DIDComm message can be used to request random delays before a message takes the next hop, which can defeat temporal correlation. And the forward message that acts as an outer envelope can be nested multiple times, causing a message to bounce around through various routing agents before arriving at the last one in the journey. But these are advanced features that many agents might not use, so it’s good to be concerned about this issue.
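The nesting of forward envelopes can be sketched roughly as follows. This is a simplified model, not the real DIDComm wire format: the field names are assumptions, and `seal` is only a base64 stand-in for encrypting a layer to one recipient, so it provides no confidentiality at all.

```python
import base64
import json

def seal(obj):
    """Stand-in for 'encrypt to one recipient'. NOT real encryption."""
    return base64.b64encode(json.dumps(obj).encode()).decode()

def unseal(blob):
    """Stand-in for the recipient decrypting its own layer."""
    return json.loads(base64.b64decode(blob))

def wrap_forwards(final_did, body, routers):
    """Nest one forward envelope per routing agent, innermost layer first."""
    packed = seal({"to": final_did, "body": body})
    next_hop = final_did
    for router in reversed(routers):
        packed = seal({"type": "forward", "to": next_hop, "msg": packed})
        next_hop = router
    return next_hop, packed  # first hop to contact, outermost envelope

first_hop, envelope = wrap_forwards(
    "did:example:bob", "hello", ["agency-a", "agency-b"]
)

# Each routing agent opens only its own layer and learns only the next hop.
layer = unseal(envelope)
while layer.get("type") == "forward":
    print("forward to", layer["to"])
    layer = unseal(layer["msg"])
print("delivered:", layer["body"])
```

With real per-hop encryption, each agent along the way would see where the message goes next but neither the original sender nor the content.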

because the contents are encrypted, in terms of confidentiality the routing agent probably poses little threat

Right.

Hello Daniel,

Thank you again for taking the time to respond! Your posts are really helpful.

Institutions and IoT devices may simply have cloud agents that directly receive messages, with no intermediate routing involved.

I totally overlooked the use-case of Sovrin in a machine-to-machine setting. Thank you for pointing out that cloud agents are not merely used by ‘real’ persons.

Thanks again!