Discourse Admin trust model

Backups and forking

In yesterday’s community call, we talked about forking as the final “plan B” for open source communities and the cost of it. Generally speaking, the lower the cost of forking, the more it empowers the community.

Thinking about what it would take to “fork” a Discourse instance, my first thought was to use the backup tools. But I found out that backups should not be public, because they contain a lot of private information:

  • contents of private messages
  • plain-text admin API keys
  • the user table (hashed passwords, private email addresses)
  • and so on…

So in terms of forking, I think something like a static snapshot of all public data would be necessary. That’s another discussion though.

Admins have access to this data

Looking into how sensitive backups are made me realize that Admins can download these backups, meaning they can see all of this data.

I think that is a massive responsibility. It’s greater than, for example, being a GitHub organization owner: the worst an owner can do is delete the whole org, and they have very limited access to private information like this. The responsibility of handling that private data lies with GitHub as a platform. (Although that could arguably be worse :sweat_smile: for your privacy)

Trust model of Discourse Admins

Basically this means Admins require ultimate trust. Or put differently, a “bad admin” could not only cause great disruption, but also violate privacy in both the moral and the legal sense, for example by breaching the GDPR.

Personally I would rather have a technical solution like e2e encryption to avoid needing to give this level of trust to anyone. But this is what Discourse can offer today. So I would like to discuss with everyone how we should handle this.

Minimal admins, bus factor

One option is to keep the number of admins as low as possible, because the risk of abuse increases with each one. That would leave us with a low bus factor. I don’t think losing private messages is the end of the world, but needing to ask all your users to sign up to a new forum and losing the public discussions would be the main disruption for me.

Bus factor mitigation admins

Following from the above option, you might argue we should have a couple of extra admins, each keeping their own independent backups, so it’s more likely we can recover from an admin going MIA. But each of these admins would need to be trusted with access to this private data. For the same reason, they can’t share the backups in a public archive or anything like that.

Public data backups

We can look into tools that allow backups of all the public data. Unfortunately that means we wouldn’t be able to save functional user accounts (private emails and hashed passwords are not public data). But it would enable anyone who’s interested to keep a backup of the public discussions.
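A rough sketch of what such a public-data backup could look like, assuming the forum exposes Discourse’s anonymous JSON views (`/latest.json` for the topic list and `/t/<id>.json` for a single topic). The function names, the `fetch` parameter and the `max_pages` cap are made up for illustration, and this is not an existing tool. As far as I know, long topics only return their first batch of posts this way, so a real tool would need extra paging.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterator


def fetch_json(url: str) -> dict:
    """Fetch a URL without credentials and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_public_topics(
    base_url: str,
    fetch: Callable[[str], dict] = fetch_json,
    max_pages: int = 1000,  # arbitrary safety cap (assumption)
) -> Iterator[dict]:
    """Yield topic summaries an anonymous visitor can see, page by page."""
    for page in range(max_pages):
        data = fetch(f"{base_url}/latest.json?page={page}")
        topics = data.get("topic_list", {}).get("topics", [])
        if not topics:
            return  # an empty page means we ran out of topics
        yield from topics


def snapshot_public_topics(
    base_url: str,
    fetch: Callable[[str], dict] = fetch_json,
) -> Dict[int, dict]:
    """Return {topic_id: topic_json} for every publicly listed topic."""
    snapshot: Dict[int, dict] = {}
    for topic in iter_public_topics(base_url, fetch):
        snapshot[topic["id"]] = fetch(f"{base_url}/t/{topic['id']}.json")
    return snapshot
```

Because every request is made without logging in, the snapshot can only contain what any visitor could already read. The result could be written out with `json.dump` and published, but a polite version would also need rate limiting.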

Thoughts / more options?

I would like to hear how everyone feels we should approach this, or whether we should look into other options (some 2-out-of-3 encryption setup?).

Same. We want to minimize risk while maximizing productivity. Encryption can help. Would be great if Discourse was e2e encrypted, but that generally makes things heavier and harder to verify.

Interestingly enough, blockchains combine public access and encryption. Public/private key pairs could be used to encrypt DMs, and if passwords were salted then the hashes would be very hard to decipher, but that’s not the reality of the situation today. It would be really cool if V2, V3, or VX of SourceCred had all these features.

Are we certain that there is no way to back up data without revealing private data?

Would admins be able to download and then clean the data to create regular snapshots for the SourceCred community? (sounds like a lot of work)
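To make the “clean the data” idea a bit more concrete, here is a minimal sketch of an allow-list scrubber, assuming the backup has already been loaded into rows of plain dictionaries. The table and column names in `PUBLIC_FIELDS` are illustrative guesses, not a verified copy of the Discourse schema, and `scrub_rows` is a hypothetical helper.

```python
from typing import Any, Dict, Iterable, List

# Hypothetical allow-list: only these columns are ever copied to the snapshot.
# The names are illustrative and would need checking against the real schema.
PUBLIC_FIELDS: Dict[str, set] = {
    "users": {"id", "username"},
    "posts": {"id", "topic_id", "user_id", "raw", "created_at"},
}


def scrub_rows(table: str, rows: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only allow-listed columns; drop tables that are not listed at all."""
    allowed = PUBLIC_FIELDS.get(table)
    if allowed is None:
        return []  # unknown table: assume it is private
    return [{k: v for k, v in row.items() if k in allowed} for row in rows]
```

An allow-list fails closed: a new private column or table is left out by default, whereas a deny-list would leak it until someone remembers to add it. This sketch does not tell public posts apart from posts inside private messages, so that filtering would still have to be added.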

Would we be able to download the data, but maybe then encrypt it via a multisig so that only a combination of trusted community members can unlock it if need be? (then you’re still trusting those community members, but then you could have 1 or 2 admins and 5-10 keys for encryption/decryption)
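One way this could work without a blockchain: encrypt the backup with a random key, then split that key so that any k out of n holders can rebuild it. Below is a minimal sketch using Shamir’s secret sharing, which is a different technique from multisig in the blockchain sense and is not a Discourse feature. The names `split_secret` and `recover_secret` and the choice of prime are assumptions for illustration, and it needs Python 3.8+ for the modular inverse via `pow`.

```python
import secrets
from typing import List, Tuple

# A Mersenne prime, comfortably larger than a 256-bit encryption key.
PRIME = 2**521 - 1


def split_secret(secret: int, threshold: int, shares: int) -> List[Tuple[int, int]]:
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x: int) -> int:
        result = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            result = (result * x + c) % PRIME
        return result

    return [(x, evaluate(x)) for x in range(1, shares + 1)]


def recover_secret(points: List[Tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

For the 2-out-of-3 case, `split_secret(key, 2, 3)` gives three shares, and passing any two of them to `recover_secret` returns `key`. A single share on its own reveals nothing about the key, so one keyholder going rogue or going missing is survivable. For anything real, a vetted library would be a safer choice than this sketch.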

Thanks for bringing this up!

For reference, here’s what bus factor is lol

And losing the archive of trails and docs and things too!

Would be great if eventually we could just put SourceCred on a blockchain to have an immutable record of public data for the project.

I would say:

  • do more research to understand exactly what’s possible
  • minimize risk, but not worry about it too much as we’re still at trust level 1
  • explore moving to a blockchain (Cosmos chains are relatively affordable)

I mean, the options are tied to opportunity cost. I’m sure I could build one of these tools if I spent two months. But if that’s what it takes, I’d rather put two months towards SourceCred so I can boost someone else building it.

The same applies to research and migrating. If there’s a tool I could deploy in a day or two, that would work for me. I’m also happy to pledge something towards that end.

Yeah, I’m not super worried about the trust issues yet. I think having a trusted contributor keep at least a monthly backup of the whole forum would be great… just in case. I’ve been doing this informally, but I’m not great at consistently repeating processes, so I’m not the best choice to be our backup keeper.

Added to the Initiatives Wishlist Category.