SourceCred

Farming cred attacks/abuse

Moving from a discord chat.

Currently it looks like you can farm cred by spamming.

I tried reading up on the timeline cred algorithm, and from what I understand, nodes like issues, comments, PRs, and PR reviews have a base value equal to their weight. That value then flows to the nodes they have relationships with, such as the repository and the comment author.
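To make my understanding concrete, here is a rough sketch of that flow. All node names, weights, and the flow fraction are invented for illustration; this is not SourceCred's actual algorithm or API, just the "base value flows along edges" idea as I understand it.

```python
# Each node mints cred equal to its base weight; a fraction of that cred
# then flows evenly along its outgoing edges (to the repository, the
# author, etc.). Names and numbers are made up for illustration.
node_weights = {"issue-1": 2.0, "comment-1": 1.0}
edges = [
    ("issue-1", "repository"),
    ("issue-1", "author:alice"),
    ("comment-1", "issue-1"),
    ("comment-1", "author:alice"),
]
FLOW_FRACTION = 0.5  # assumed: half of each node's minted cred flows onward

cred = dict(node_weights)
cred.update({"repository": 0.0, "author:alice": 0.0})
for src, dst in edges:
    out_degree = sum(1 for s, _ in edges if s == src)
    share = FLOW_FRACTION * node_weights.get(src, 0.0) / out_degree
    cred[dst] = cred.get(dst, 0.0) + share
```

Even in this toy version you can see the attack surface: every new spam node mints fresh cred, and some of it reaches `author:alice` regardless of whether anyone else interacts.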

At small volumes of spam, as long as you don’t “feed the trolls” by interacting with these threads, the cred gained this way would stay very low, perhaps negligible.

But because attackers have the means to create cred out of thin air and can have some of it flow to themselves, at scale this might be a way to boost your account. An overt attack could use this cred-creation property to gain significant cred. For example, a few accounts flood the repository with hundreds of issues and comments all feeding into each other, each creating a small amount of new cred that accumulates in those accounts. That alone could be the attack, for example to earn money when cred is rewarded financially. Or it could be a stepping stone: using the inflated cred of these malicious accounts for other ways of gaming the algorithm, for example treating the spam accounts as expendable (expecting them to be banned) while letting some cred flow to your real account, hoping it will escape scrutiny.

Other means of farming are definitely conceivable, for example trying to avoid detection by appearing more legitimate: automatically sending “LGTM” to PRs, or “What do you mean by that @some-involved-user?” to issues with a couple of comments (hoping to get replies, as those would mean more indirect cred for the attacker). It would be less obvious whether these are bots or real users.


Some thoughts on handling this. The earlier idea of cost may help: Cred, Cost, and 'Resistance'. An overt spam attack would have low value, but that could be offset by increasing its cost, as it now requires moderation effort. Depending on the implementation, a bot account might even work its way into “debt”, or create such high resistance for itself that it ends up in a kind of isolation, making further spamming pointless as a way to earn money.

As that thread suggested, though, it also means you need to figure out how to attribute costs, and it has implications for all cred flow, not just abuse.

Perhaps a simpler approach would be to think of moderation tools that discourage future misbehavior: a one-off cred penalty, temporarily zeroing out a user’s cred for N weeks (timeline windows), permanently blacklisting a user’s cred. The usual moderation tools. They could be encoded in the graph by a new node with specialized edges, or could just be code paths that bypass the algorithm for enforcement. What I don’t like about this approach is that it’s reactive. It also requires spending precious time dealing with abuse, and at best only nullifies the damage done. Worst case, an attacker escapes notice and gets paid out in currency before moderation happens. You’re more or less in the same boat as email there: anyone can attack, the attacker loses nothing if it fails, and defending is costly. You can do sophisticated pattern scanning (much like anti-malware in email), coordinated global blacklisting, machine learning for new patterns, etc. But you won’t catch everything, and there’s still no reason to stop attacking.

One difference from email, though: what if we protect the transition from cred to currency? When you’re paying people based on cred, maintainers could need to approve users for payouts, and users could need to claim them with some simple protections like a captcha. That would at least eliminate overt attacks, I assume. However, it’s still an administrative burden.

What are your thoughts and ideas?


These are great questions @beanow; thanks for bringing them up!

Dimensions of Cred Defenses

Algorithmic vs Human-moderated

Ideally, we will have clever algorithms filtering down the weight of most low-value or low-effort comments. As very simple examples, we could have a heuristic that very short posts get less weight, or posts that consist of just a bunch of @-references get low weight, etc. The simple examples would be easy to implement today, although they are also pretty easy to game.
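A toy version of those two heuristics might look like the following. The thresholds and the down-weight factor are invented for illustration; no such heuristic exists in SourceCred today, and as noted, rules this simple are easy to game.

```python
# Illustrative heuristic: very short posts, or posts that are mostly
# @-references, get a fraction of the base weight. All thresholds are
# assumptions made up for this sketch.
def post_weight(body: str, base_weight: float = 1.0) -> float:
    words = body.split()
    mentions = [w for w in words if w.startswith("@")]
    if len(words) < 5:  # very short post, e.g. "LGTM"
        return base_weight * 0.1
    if len(mentions) >= len(words) / 2:  # mostly a bunch of @-references
        return base_weight * 0.1
    return base_weight
```

An attacker who reads this code immediately knows the workaround: pad every spam post to five non-mention words.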

One of the principles of SourceCred is that everything should be open-source and transparent, even the anti-gaming algorithms. I think this is important – otherwise how will you know that the maintainer of a project isn’t using a black box algorithm that secretly gives higher weight to everything they or their friends write? However, transparency also makes it easier to game the system–an attacker can read the code to find exactly how to work around the spam detector.

Because of this, I think a degree of human moderation will always be needed.

Moderating content vs moderating users

Right now, timeline cred allows moderating cred at the content layer, but not at the user layer. I can set the weight on your pulls to 0, but can’t meaningfully set the weight of your user to 0. (Technically, I can, but it won’t have any effect.) However, the “legacy” cred actually does allow moderating at the user level; in legacy cred, if I set the weight of a user to 0, every incoming connection to them will have 0 weight, so they get no cred.
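The legacy-cred behavior can be sketched like this: an edge's effective weight toward a user is multiplied by that user's node weight, so zeroing the user zeroes every incoming connection at once. (Names and structures here are invented for illustration, not SourceCred's actual types.)

```python
# Legacy-cred-style user moderation (illustrative): incoming edge weight
# is scaled by the destination user's node weight.
def effective_edge_weight(edge_weight: float, dst_node_weight: float) -> float:
    return edge_weight * dst_node_weight

user_weights = {"alice": 1.0, "spambot": 0.0}  # moderator zeroed spambot
incoming = [("issue-7", "spambot", 2.0), ("issue-8", "alice", 2.0)]

flows = {
    dst: effective_edge_weight(w, user_weights[dst])
    for _, dst, w in incoming
}
```

One moderator action covers all of a user's contributions, past and future, instead of chasing each spam node individually.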

Clearly, having user-level moderation is much more effective; trying to use content weights for moderation is a whack-a-mole game that will quickly exhaust the moderators.

I see the current timeline cred algorithm, in which every individual contribution is a ‘source’ of cred based on its node weight, as merely the latest prototype algorithm. It has nice properties, in that it naturally assigns cred over time based on activity levels in the project, and activity can often be correlated with value creation. But in the medium term, I want to switch to an approach where project level goals are the sources of cred, and the cred flows to individual contributions based on how they support the goals.

Whitelisting vs Blacklisting

One way we can think of weight setting is via “whitelisting” vs “blacklisting”. Right now we have a sort of blacklisting approach: by default, we consider contributions valuable, unless we manually lower the weight. (Of course, we could also raise the weight, increasing the perceived value.) However, if we switch to a paradigm where cred accrues at project level goals and then flows out to content from there, it might be more like “whitelisting”, where contributions get value once someone has connected them to something the project cares about. Some of this whitelisting might be automatic: e.g. a PR getting merged connects it to the codebase without additional human moderation. But to get a whitelisting-based approach working, we’ll need to dramatically lower the cost for users to add more info about what contributions are valuable, e.g. via a SourceCred browser extension.
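In the whitelisting paradigm, cred would originate only at goal nodes, so a spam issue with no path from any goal simply earns nothing. A minimal sketch, with made-up node names and a one-step flow standing in for the real PageRank-style propagation:

```python
# Whitelisting sketch (illustrative): cred seeds at project-level goals
# and flows only to contributions someone has connected to them.
goal_seed = {"goal:ship-v1": 10.0}
supports = [  # (goal, contribution) edges added by maintainers/automation
    ("goal:ship-v1", "pr-42"),
    ("goal:ship-v1", "issue-13"),
]

cred = {}
for goal, contribution in supports:
    out_degree = sum(1 for g, _ in supports if g == goal)
    cred[contribution] = cred.get(contribution, 0.0) + goal_seed[goal] / out_degree

# A spam issue never connected to any goal earns nothing by default:
spam_cred = cred.get("issue-999", 0.0)
```

Note how this inverts the default: spam is worthless until whitelisted, rather than valuable until blacklisted.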

Within Cred / Outside Cred

As @Beanow noted, we can apply moderation either within the core cred algorithm, or as an “amendment” on top. For example, if someone spams the GitHub account, we could block that user from interacting with our repository, and report them to GitHub. Also, we could do out-of-band whitelisting before we send anyone currency based on cred. (I definitely plan to do that–along with a minimum payout of e.g. $10 to reduce administration overhead.)

We could also introduce ways to directly moderate the cred output, e.g. applying a flat penalty or reduction to someone’s cred. I’m reluctant to do that right now – I’d rather explore ways to include the information directly into the data that SourceCred already consumes, e.g. by reducing someone’s weight in the graph.


Given all the above, what should we actually do today?

We definitely should build some capabilities to defend against cred gaming, because it’s quite easy right now. So we need to make more effective tools for moderation. (We wouldn’t need to change anything for legacy cred, because the tool of lowering a user’s weight is already really effective.)

In keeping with my preference to leverage the existing set of concepts (PageRank + weights) rather than add hacks on top (manual cred modification), I think we should focus on better ways of changing spammy node-weights en masse. Two specific approaches:

  1. We can add a heuristic that changes the weight on a node based on the weight of its author. Then, setting the weight of an account to 0 will mostly erase their influence on cred. (Note: this is actually pretty robust, because it prevents attacks where the spam account isn’t supposed to get cred, but is trying to route the cred to a more legit-looking account. In contrast, the way that legacy-cred enables moderation is vulnerable to rerouted cred attacks.)

  2. We can add a heuristic that if a comment receives a :-1: reaction, then its weight gets set to 0. Obviously, this tool would itself be vulnerable to abuse, so we would need to add more complexity. We could define “trusted contributors” as all contributors with > x cred, and then only trusted contributors’ downvotes count. But then we get a weird cyclic dependency where the cred output depends on the cred output.
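Both heuristics could be sketched together as a single weight-moderation step. Everything here (the trust threshold, field names, the flat multiply) is an assumption for illustration, not a proposed final design:

```python
# Illustrative combination of the two heuristics above:
# 1. scale a node's weight by its author's weight, and
# 2. zero the weight if a "trusted" contributor left a :-1: reaction.
TRUST_THRESHOLD = 100.0  # assumed cred needed to count as "trusted"

def moderated_weight(base, author_weight, downvoters, cred_scores):
    if any(cred_scores.get(u, 0.0) > TRUST_THRESHOLD for u in downvoters):
        return 0.0
    return base * author_weight

cred_scores = {"maintainer": 500.0, "newbie": 5.0}
# spambot's account weight was set to 0 by a moderator:
w1 = moderated_weight(1.0, author_weight=0.0, downvoters=[], cred_scores=cred_scores)
# a comment downvoted by a trusted contributor:
w2 = moderated_weight(1.0, 1.0, ["maintainer"], cred_scores)
# a downvote from an untrusted account has no effect:
w3 = moderated_weight(1.0, 1.0, ["newbie"], cred_scores)
```

The cyclic dependency shows up in `cred_scores`: it is itself an output of the algorithm whose inputs it is now gating.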

Either of these approaches is conceptually simple, but implementing it cleanly will require putting some work into the ‘heuristics system’. Gun to my head, if someone starts gaming cred hardcore tomorrow and I need a quick fix, I’ll just implement a manual blacklist, even if it’s not the cleanest API.

So, for example, to penalize a user, before solving for the Markov chain’s stationary distribution you would remove the user’s edges to isolate them from cred flow, or weight those edges with 0 to the same effect?

We’re already pre-processing the edges: defaulting them to 0 if they are for a future interval, and applying decay if they are from a past interval. To describe moderation conveniently, you could have a list of {node address, start, end, factor} penalties. That would allow things like a 50% cred reduction for 3 weeks, or 0% cred forever, at the user level. For performance’s sake you would probably use {node address, start, end} to find matching edges instead, and keep that in a map with the factor.
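A minimal sketch of that penalty list, with assumed field names and a plain linear scan instead of the map-based lookup you'd want for performance:

```python
# Penalty list sketch: {node address, start, end, factor}. Field names
# and the interval representation are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Penalty:
    node_address: str
    start: int     # first timeline interval the penalty applies to
    end: int       # last interval, inclusive; use a huge value for "forever"
    factor: float  # 0.5 => 50% cred reduction, 0.0 => fully zeroed

def apply_penalties(edge_weight, node_address, interval, penalties):
    for p in penalties:
        if p.node_address == node_address and p.start <= interval <= p.end:
            edge_weight *= p.factor
    return edge_weight

penalties = [
    Penalty("user:spammer", start=10, end=12, factor=0.5),  # 50% for 3 weeks
    Penalty("user:bot", start=0, end=10**9, factor=0.0),    # zeroed "forever"
]
```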

Is that the sort of within cred implementation you mean?

So, with only a high-level understanding of the algorithm, I was under the (mistaken) assumption that cred flowed to a contribution only if another contributor with cred interacted with it in some way. E.g. if someone posted a bikeshedding comment and nobody interacted with it, it would get 0 cred. This kind of makes sense to me, though I suppose many valid contributions may see no interaction as well… For what it’s worth, when I imagine SC in the wild, I imagine that, at a high level, normal human behavior (e.g. shaming/ostracizing unwanted contributors) driving cred could solve a lot of problems and make the system more robust and human. Then again, perhaps that introduces more vectors for corruption and collusion, especially as money starts to flow…

This isn’t quite what I mean, for two reasons:

  1. It wouldn’t fully mitigate the attack. The attacker’s nodes would still be “creating” garbage cred; even if the nominal author of the attacker’s nodes doesn’t earn cred, that cred could still get routed somewhere else. E.g. the simple attack where spambot creates 100 issues that all say “thanks @decentralion” would route me loads of cred. And if your response is to put me in cred jail: I could have the attacker route garbage cred all over the place, so that it’s hard to tell who the intended recipient of the garbage cred is.

  2. Although this is expressed through the edge weights, there’s a sense in which it creates a new concept of node weight. There’s node weight for the purpose of setting the seed vector / cred allocation, and node weight for the purpose of re-weighting all the edges. So I feel like, to a degree, it’s creating a new primitive instead of re-using the existing ones.

“legacy” cred kind of worked this way; something that had no “incoming” edges from elsewhere in the project would tend not to have any cred, due to the absence of a seed vector. Now, because of the seed vector, everything in the repo “owns” some cred. Doing it this way made it a lot easier to reason about time-weighted cred (things “own” less and less cred when they are old), but I still think of it as a prototype, and expect we will develop a more sophisticated and robust algorithm in the future.
