I think Jason Law ought to weigh in on this, but here’s my quick take:
Sovrin does not prevent storing encrypted attributes on the ledger (anything you store on the ledger is a blob, and all blobs are equally opaque). However, the Sovrin community should have an opinion about what attributes are good to store on ledger, and I believe the opinion should be: public attributes only.
When you have an attribute that everybody in the world is supposed to know (e.g., the street address of a university), then recording that attribute in a public source of truth makes sense. But when you have data that you want only one entity (or a very few, selectively chosen entities) to know, encrypting it and storing it on the ledger feels risky. Yes, you can encrypt it, and yes, the encryption that Sovrin uses is industrial-strength and guarded by best practices. But the permanence and public nature of the ledger run counter to the need for secrets; why put a hacking target out there, where all the world can see it for all time? If a key ever becomes known, old secrets may surface. The ledger becomes a honeypot (although the scope of any one hack is limited, since very few secrets on the ledger would be encrypted with the same key).
Instead, we recommend using secure communication channels established and maintained through the Sovrin ledger to exchange secret information out-of-band. Alice stores her public key for communications with Bob on-ledger; Bob stores his public key for communications with Alice on-ledger. When they wish to exchange a secret, they look up the respective public keys on-ledger to account for any revocation/rotation, then send the secret. Agencies help here, because “send the secret” implies that each can somehow contact the other, and an agency can provide an addressable endpoint in the cloud that proxies an identity owner for communication purposes.
Hash-tree encryption is, indeed, used for zero-knowledge proofs, but not necessarily in conjunction with data stored on the ledger. Essentially what you have is a way to hash progressively less granular views (subsets) of a larger document, such that you can disclose a tiny portion of plain text, leave the rest of the doc encrypted, but prove that you possess the larger document because you can produce the bytes that generate its hash.
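To make the hash-tree idea concrete, here is a minimal sketch of a plain Merkle tree over the fields of a document, using SHA-256 from the Python standard library. This is my own illustration, not the scheme Sovrin actually uses: the function names (`build_levels`, `prove`, `verify`), the field layout, and the choice of hash are all assumptions, and in this sketch the undisclosed fields are represented only by their hashes.

```python
import hashlib


def leaf_hash(data: bytes) -> bytes:
    # Prefix leaves with 0x00 so a leaf can never be mistaken for an inner node.
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    # Prefix inner nodes with 0x01.
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(leaves):
    """Return every level of the tree, from the leaf hashes up to the root."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [leaf_hash(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def prove(levels, index):
    """Collect the sibling hashes needed to rebuild the root from one leaf."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path


def verify(leaf: bytes, path, root: bytes) -> bool:
    """Recompute the root from a disclosed leaf and its sibling path."""
    acc = leaf_hash(leaf)
    for sibling, sibling_is_left in path:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


# Hypothetical document: disclose only the "degree" field.
fields = [b"name=Alice", b"dob=1990-01-01", b"degree=BSc", b"gpa=3.9", b"id=12345"]
levels = build_levels(fields)
root = levels[-1][0]
proof = prove(levels, 2)
print(verify(b"degree=BSc", proof, root))
```

The verifier only ever sees one plaintext field plus a handful of hashes, yet can check that the field belongs to the document whose root hash it already trusts. A real deployment would also need to salt low-entropy fields so they cannot be guessed from their hashes.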
The only kind of key stored on ledger (IMO) should be asymmetric public keys (the “verification keys” used in elliptic curve cryptography). Private keys should be stored at the “edge” (that is, with the user, where they were originally generated, perhaps on a mobile device or similar) and never transmitted. Symmetric keys can be shared using a secure comm channel built from published public keys in much the same way that SSL/TLS builds a channel and then shares a symmetric session key; the symmetric keys are not stored anywhere but with the endpoints (the entities that are interacting). This makes symmetric keys ephemeral; the only revocation you have to worry about is revocation of the public key. Revocation transactions go on the ledger, so the ledger becomes the public source of truth about which key to use to talk to identifier X.
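As a rough illustration of “the ledger is the source of truth about which key to use,” here is a toy append-only log of key transactions. Everything in it is hypothetical (the `KeyTxn` and `Ledger` names, the string keys, using `None` to mean “revoked”); it is not the Sovrin API, just a sketch of the lookup rule that the newest transaction for an identifier wins.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KeyTxn:
    identifier: str
    verkey: Optional[str]  # None means "revoked, no replacement published"


class Ledger:
    """Toy append-only log; real ledgers add consensus, signatures, etc."""

    def __init__(self) -> None:
        self._txns: List[KeyTxn] = []

    def append(self, txn: KeyTxn) -> None:
        # Transactions are only ever added, never edited or removed.
        self._txns.append(txn)

    def current_verkey(self, identifier: str) -> Optional[str]:
        # Walk backwards: the most recent transaction for an identifier wins.
        for txn in reversed(self._txns):
            if txn.identifier == identifier:
                return txn.verkey
        return None  # identifier never appeared on the ledger


ledger = Ledger()
ledger.append(KeyTxn("alice-for-bob", "verkey-A1"))
ledger.append(KeyTxn("bob-for-alice", "verkey-B1"))
ledger.append(KeyTxn("alice-for-bob", "verkey-A2"))  # Alice rotates her key
print(ledger.current_verkey("alice-for-bob"))
```

Bob does this lookup right before he sends anything, so a rotated or revoked key is never used by accident. Only public keys and revocations ever appear in the log; the symmetric session key lives and dies with the two endpoints.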
An agency can facilitate recovery mechanisms by supporting secure backups of keys known to an identity owner (where the backup is opaque to the agency, and decryptable only by the owner using a biometric or similar). It is also possible to use Shamir secret sharing to support recovery scenarios where, say, 5 friends are designated as trustees, and at least 3 friends must agree on the recovery to unlock it.
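For the curious, the 3-of-5 trustee scenario can be sketched with textbook Shamir secret sharing over a prime field, using only the Python standard library. The prime, the function names, and treating the secret as an integer are my assumptions for illustration; this is not necessarily how a Sovrin agency or wallet would implement recovery, and production code should use a vetted library.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, large enough to hold a 256-bit key as an integer.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, shares: int):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be in [0, PRIME)")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def poly(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, poly(x)) for x in range(1, shares + 1)]


def recover_secret(points) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret


# Five trustees, any three of whom can recover the key.
key = secrets.randbits(256)
trustee_shares = split_secret(key, threshold=3, shares=5)
print(recover_secret(trustee_shares[:3]) == key)
```

Each friend holds one `(x, y)` point, which on its own reveals nothing about the key. Any three points pin down the polynomial and therefore the secret; two or fewer do not.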