SourceCred

Farming cred attacks/abuse

Moving from a discord chat.

Currently it looks like you can farm cred by spamming.

I tried reading up on the timeline cred algorithm, and from what I understand, nodes like issues, comments, PRs, and PR reviews have a base value equal to their weight. That value then flows to the nodes they have relationships with, such as the repository and the comment author.
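To make my understanding concrete, here is a rough sketch of that flow. All node names, weights, and the flow fraction are invented for illustration; this is not SourceCred's actual algorithm or API, just the "base value flows along edges" idea as I understand it.

```python
# Each node mints cred equal to its base weight; a fraction of that cred
# then flows evenly along its outgoing edges (to the repository, the
# author, etc.). Names and numbers are made up for illustration.
node_weights = {"issue-1": 2.0, "comment-1": 1.0}
edges = [
    ("issue-1", "repository"),
    ("issue-1", "author:alice"),
    ("comment-1", "issue-1"),
    ("comment-1", "author:alice"),
]
FLOW_FRACTION = 0.5  # assumed: half of each node's minted cred flows onward

cred = dict(node_weights)
cred.update({"repository": 0.0, "author:alice": 0.0})
for src, dst in edges:
    out_degree = sum(1 for s, _ in edges if s == src)
    share = FLOW_FRACTION * node_weights.get(src, 0.0) / out_degree
    cred[dst] = cred.get(dst, 0.0) + share
```

Even in this toy version you can see the attack surface: every new spam node mints fresh cred, and some of it reaches `author:alice` regardless of whether anyone else interacts.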

At small volumes of spam, as long as you don’t “feed the trolls” by interacting with these threads, the cred gained this way would stay very low, perhaps negligible.

But because attackers have the means to create cred out of thin air and can have some of it flow to themselves, at scale this might be a way to boost your account. An overt attack could use this cred-creation property to gain significant cred. For example, a few accounts flood the repository with hundreds of issues and comments all feeding into each other, each creating a small amount of new cred that accumulates in those accounts. That alone could be the attack, for example to earn money when cred is rewarded financially. Or it could be a stepping stone: using the inflated cred of these malicious accounts for other ways of gaming the algorithm, for example treating the spam accounts as expendable (expecting them to be banned) while letting some cred flow to your real account, hoping it will escape scrutiny.

Other means of farming are definitely conceivable, for example trying to avoid detection by appearing more legitimate: automatically sending “LGTM” to PRs, or “What do you mean by that @some-involved-user?” to issues with a couple of comments (hoping to get replies, as those would mean more indirect cred for the attacker). It would be less obvious whether these are bots or real users.


Some thoughts on handling this. The earlier idea of cost may help: Cred, Cost, and 'Resistance'. An overt spam attack would have low value, but that could be offset by increasing its cost, as it now requires moderation effort. Depending on the implementation, a bot account might even work its way into “debt”, or create such high resistance for itself that it ends up in a kind of isolation, making further spamming pointless as a way to earn money.

As that thread suggested, though, it also means you need to figure out how to attribute costs, and it has implications for all cred flow, not just abuse.

Perhaps a simpler approach would be to think of moderation tools that discourage future misbehavior: a one-off cred penalty, temporarily zeroing out a user’s cred for N weeks (timeline windows), permanently blacklisting a user’s cred. The usual moderation tools. They could be encoded in the graph by a new node with specialized edges, or could just be code paths that bypass the algorithm for enforcement. What I don’t like about this approach is that it’s reactive. It also requires spending precious time dealing with abuse, and at best only nullifies the damage done. Worst case, an attacker escapes notice and gets paid out in currency before moderation happens. You’re more or less in the same boat as email there: anyone can attack, the attacker loses nothing if it fails, and defending is costly. You can do sophisticated pattern scanning (much like anti-malware in email), coordinated global blacklisting, machine learning for new patterns, etc. But you won’t catch everything, and there’s still no reason to stop attacking.

One difference from email, though: what if we protect the transition from cred to currency? When you’re paying people based on cred, maintainers could need to approve users for payouts, and users could need to claim them with some simple protections like a captcha. That would at least eliminate overt attacks, I assume. However, it’s still an administrative burden.

What are your thoughts and ideas?


These are great questions @beanow; thanks for bringing them up!

Dimensions of Cred Defenses

Algorithmic vs Human-moderated

Ideally, we will have clever algorithms filtering down the weight of most low-value or low-effort comments. As very simple examples, we could have a heuristic that very short posts get less weight, or posts that consist of just a bunch of @-references get low weight, etc. The simple examples would be easy to implement today, although they are also pretty easy to game.
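A toy version of those two heuristics might look like the following. The thresholds and the down-weight factor are invented for illustration; no such heuristic exists in SourceCred today, and as noted, rules this simple are easy to game.

```python
# Illustrative heuristic: very short posts, or posts that are mostly
# @-references, get a fraction of the base weight. All thresholds are
# assumptions made up for this sketch.
def post_weight(body: str, base_weight: float = 1.0) -> float:
    words = body.split()
    mentions = [w for w in words if w.startswith("@")]
    if len(words) < 5:  # very short post, e.g. "LGTM"
        return base_weight * 0.1
    if len(mentions) >= len(words) / 2:  # mostly a bunch of @-references
        return base_weight * 0.1
    return base_weight
```

An attacker who reads this code immediately knows the workaround: pad every spam post to five non-mention words.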

One of the principles of SourceCred is that everything should be open-source and transparent, even the anti-gaming algorithms. I think this is important – otherwise how will you know that the maintainer of a project isn’t using a black box algorithm that secretly gives higher weight to everything they or their friends write? However, transparency also makes it easier to game the system–an attacker can read the code to find exactly how to work around the spam detector.

Because of this, I think a degree of human moderation will always be needed.

Moderating content vs moderating users

Right now, timeline cred allows moderating cred at the content layer, but not at the user layer. I can set the weight on your pulls to 0, but can’t meaningfully set the weight of your user to 0. (Technically, I can, but it won’t have any effect.) However, the “legacy” cred actually does allow moderating at the user level; in legacy cred, if I set the weight of a user to 0, every incoming connection to them will have 0 weight, so they get no cred.
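The legacy-cred behavior can be sketched like this: an edge's effective weight toward a user is multiplied by that user's node weight, so zeroing the user zeroes every incoming connection at once. (Names and structures here are invented for illustration, not SourceCred's actual types.)

```python
# Legacy-cred-style user moderation (illustrative): incoming edge weight
# is scaled by the destination user's node weight.
def effective_edge_weight(edge_weight: float, dst_node_weight: float) -> float:
    return edge_weight * dst_node_weight

user_weights = {"alice": 1.0, "spambot": 0.0}  # moderator zeroed spambot
incoming = [("issue-7", "spambot", 2.0), ("issue-8", "alice", 2.0)]

flows = {
    dst: effective_edge_weight(w, user_weights[dst])
    for _, dst, w in incoming
}
```

One moderator action covers all of a user's contributions, past and future, instead of chasing each spam node individually.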

Clearly, having user-level moderation is much more effective; trying to use content weights for moderation is a whack-a-mole game that will quickly exhaust the moderators.

I see the current timeline cred algorithm, in which every individual contribution is a ‘source’ of cred based on its node weight, as merely the latest prototype algorithm. It has nice properties, in that it naturally assigns cred over time based on activity levels in the project, and activity can often be correlated with value creation. But in the medium term, I want to switch to an approach where project level goals are the sources of cred, and the cred flows to individual contributions based on how they support the goals.

Whitelisting vs Blacklisting

One way we can think of weight setting is via “whitelisting” vs “blacklisting”. Right now we have a sort of blacklisting approach: by default, we consider contributions valuable, unless we manually lower the weight. (Of course, we could also raise the weight, increasing the perceived value.) However, if we switch to a paradigm where cred accrues at project level goals and then flows out to content from there, it might be more like “whitelisting”, where contributions get value once someone has connected them to something the project cares about. Some of this whitelisting might be automatic: e.g. a PR getting merged connects it to the codebase without additional human moderation. But to get a whitelisting-based approach working, we’ll need to dramatically lower the cost for users to add more info about what contributions are valuable, e.g. via a SourceCred browser extension.
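In the whitelisting paradigm, cred would originate only at goal nodes, so a spam issue with no path from any goal simply earns nothing. A minimal sketch, with made-up node names and a one-step flow standing in for the real PageRank-style propagation:

```python
# Whitelisting sketch (illustrative): cred seeds at project-level goals
# and flows only to contributions someone has connected to them.
goal_seed = {"goal:ship-v1": 10.0}
supports = [  # (goal, contribution) edges added by maintainers/automation
    ("goal:ship-v1", "pr-42"),
    ("goal:ship-v1", "issue-13"),
]

cred = {}
for goal, contribution in supports:
    out_degree = sum(1 for g, _ in supports if g == goal)
    cred[contribution] = cred.get(contribution, 0.0) + goal_seed[goal] / out_degree

# A spam issue never connected to any goal earns nothing by default:
spam_cred = cred.get("issue-999", 0.0)
```

Note how this inverts the default: spam is worthless until whitelisted, rather than valuable until blacklisted.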

Within Cred / Outside Cred

As @Beanow noted, we can apply moderation either within the core cred algorithm, or as an “amendment” on top. For example, if someone spams the GitHub account, we could block that user from interacting with our repository, and report them to GitHub. Also, we could do out-of-band whitelisting before we send anyone currency based on cred. (I definitely plan to do that–along with a minimum payout of e.g. $10 to reduce administration overhead.)

We could also introduce ways to directly moderate the cred output, e.g. applying a flat penalty or reduction to someone’s cred. I’m reluctant to do that right now – I’d rather explore ways to include the information directly into the data that SourceCred already consumes, e.g. by reducing someone’s weight in the graph.


Given all the above, what should we actually do today?

We definitely should build some capabilities to defend against cred gaming, because it’s quite easy right now. So we need to make more effective tools for moderation. (We wouldn’t need to change anything for legacy cred, because the tool of lowering a user’s weight is already really effective.)

In keeping with my preference to leverage the existing set of concepts (PageRank + weights) rather than add hacks on top (manual cred modification), I think we should focus on better ways of changing spammy node-weights en masse. Two specific approaches:

  1. We can add a heuristic that changes the weight on a node based on the weight of its author. Then, setting the weight of an account to 0 will mostly erase their influence on cred. (Note: this is actually pretty robust, because it prevents attacks where the spam account isn’t supposed to get cred, but is trying to route the cred to a more legit-looking account. In contrast, the way that legacy-cred enables moderation is vulnerable to rerouted cred attacks.)

  2. We can add a heuristic that if a comment receives a :-1: reaction, then its weight gets set to 0. Obviously, this tool would itself be vulnerable to abuse, so we would need to add more complexity. We could define “trusted contributors” as all contributors with > x cred, and then only trusted contributors’ downvotes count. But then we get a weird cyclic dependency where the cred output depends on the cred output.
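Both heuristics could be sketched together as a single weight-moderation step. Everything here (the trust threshold, field names, the flat multiply) is an assumption for illustration, not a proposed final design:

```python
# Illustrative combination of the two heuristics above:
# 1. scale a node's weight by its author's weight, and
# 2. zero the weight if a "trusted" contributor left a :-1: reaction.
TRUST_THRESHOLD = 100.0  # assumed cred needed to count as "trusted"

def moderated_weight(base, author_weight, downvoters, cred_scores):
    if any(cred_scores.get(u, 0.0) > TRUST_THRESHOLD for u in downvoters):
        return 0.0
    return base * author_weight

cred_scores = {"maintainer": 500.0, "newbie": 5.0}
# spambot's account weight was set to 0 by a moderator:
w1 = moderated_weight(1.0, author_weight=0.0, downvoters=[], cred_scores=cred_scores)
# a comment downvoted by a trusted contributor:
w2 = moderated_weight(1.0, 1.0, ["maintainer"], cred_scores)
# a downvote from an untrusted account has no effect:
w3 = moderated_weight(1.0, 1.0, ["newbie"], cred_scores)
```

The cyclic dependency shows up in `cred_scores`: it is itself an output of the algorithm whose inputs it is now gating.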

Either of these approaches is conceptually simple, but implementing it cleanly will require putting some work into the ‘heuristics system’. Gun to my head, if someone starts gaming cred hardcore tomorrow and I need a quick fix, I’ll just implement a manual blacklist, even if it’s not the cleanest API.

So, for example, to penalize a user, before solving for the Markov chain’s stationary distribution you would remove the user’s edges to isolate them from cred flow, or weight those edges with 0 to the same effect?

We’re already pre-processing the edges: defaulting them to 0 if they are for a future interval, and applying decay if they are from a past interval. To describe moderation conveniently, you could have a list of {node address, start, end, factor} penalties. That would allow things like a 50% cred reduction for 3 weeks, or 0% cred forever, at the user level. For performance’s sake you would probably use {node address, start, end} to find matching edges instead, and keep that in a map with the factor.
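A minimal sketch of that penalty list, with assumed field names and a plain linear scan instead of the map-based lookup you'd want for performance:

```python
# Penalty list sketch: {node address, start, end, factor}. Field names
# and the interval representation are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Penalty:
    node_address: str
    start: int     # first timeline interval the penalty applies to
    end: int       # last interval, inclusive; use a huge value for "forever"
    factor: float  # 0.5 => 50% cred reduction, 0.0 => fully zeroed

def apply_penalties(edge_weight, node_address, interval, penalties):
    for p in penalties:
        if p.node_address == node_address and p.start <= interval <= p.end:
            edge_weight *= p.factor
    return edge_weight

penalties = [
    Penalty("user:spammer", start=10, end=12, factor=0.5),  # 50% for 3 weeks
    Penalty("user:bot", start=0, end=10**9, factor=0.0),    # zeroed "forever"
]
```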

Is that the sort of within cred implementation you mean?

So, with only a high-level understanding of the algorithm, I was under the (mistaken) assumption that cred flowed to a contribution only if another contributor with cred interacted with it in some way. E.g. if someone posted a bikeshedding comment and nobody interacted with it, it would get 0 cred. This kind of makes sense to me, though I suppose many valid contributions may see no interaction as well… For what it’s worth, when I imagine SC in the wild, I imagine that, at a high level, normal human behavior (e.g. shaming/ostracizing unwanted contributors) driving cred could solve a lot of problems and make the system more robust and human. Then again, perhaps that introduces more vectors for corruption and collusion, especially as money starts to flow…

This isn’t quite what I mean, for two reasons:

  1. It wouldn’t fully mitigate the attack. The attacker’s nodes would still be “creating” garbage cred; even if the nominal author of the attacker’s nodes doesn’t earn cred, that cred could still get routed somewhere else. E.g. the simple attack where spambot creates 100 issues that all say “thanks @decentralion” would route me loads of cred. And if your response is to put me in cred jail: I could have the attacker route garbage cred all over the place, so that it’s hard to tell who the intended recipient of the garbage cred is.

  2. Although this is expressed through the edge weights, there’s a sense in which it creates a new concept of node weight. There’s node weight for the purpose of setting the seed vector / cred allocation, and node weight for the purpose of re-weighting all the edges. So I feel like, to a degree, it’s creating a new primitive instead of re-using the existing ones.

“legacy” cred kind of worked this way; something that had no “incoming” edges from elsewhere in the project would tend not to have any cred, due to the absence of a seed vector. Now, because of the seed vector, everything in the repo “owns” some cred. Doing it this way made it a lot easier to reason about time-weighted cred (things “own” less and less cred when they are old), but I still think of it as a prototype, and expect we will develop a more sophisticated and robust algorithm in the future.
