Wasabi Wallet Documentation

Hi @max, welcome to the SourceCred discourse! Super glad to have you here.

First off, nice work on the Wasabi Wallet docs. I spent a while reading them this morning and learned quite a bit about privacy in Bitcoin. I hope that as SourceCred grows it will attract people as talented as yourself to help document it. :wink:

On to the matter at hand: calibrating SourceCred for use on WasabiDocs. I’ll note that @s_ben has written some great thoughts here: SourceCred for Documentation.

First, let’s take a look at how well SourceCred works on WasabiDocs out-of-the-box. Here’s the cred on zkSNACKs/WasabiDoc with the default settings:

Before digging further, I’m curious what you think of these scores. Do they seem mostly right? Do you think certain contributors seem under- or over-valued?

One of the things we talked about in our Twitter chat (and @s_ben brings up in his thread) is using # of lines changed as a heuristic for setting the weight. So I made a prototype that has this behavior so we can see the effects.

Let me briefly explain the concept of “node weight” in SourceCred. In the current implementation, a node’s weight has two effects: it increases the total absolute amount of cred in a period, and it makes cred “start flowing” at that node, to then propagate outwards according to the PageRank algorithm and the structure of the graph. By default, every GitHub contribution gets a weight as follows:

SourceCred also supports manually setting the weights on individual nodes. So, to come up with a lines-of-code experimental branch, I set the default node weight for every non-pull request to 0, and then set a manual weight for each pull request according to the following simple algorithm:

// Manual weight for a pull request: its added-line count if merged, else 0.
function scoreFor(p: R.Pull): number {
  if (p.mergedAs() == null) {
    // No cred for unmerged code.
    return 0;
  }
  return p.additions();
}

We might want a cleverer algorithm. For example, we might want to consider just net additions, or to modify the score based on what percentage of the lines changed were net additions. But for this experiment I just went with the simplest thing.
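To make those two alternatives concrete, here is a rough sketch of what they could look like. The `PullStats` shape and both function names are hypothetical stand-ins for illustration, not SourceCred's actual `R.Pull` API, and neither variant was used in the experiment.

```typescript
// Hypothetical shape for illustration; not SourceCred's actual R.Pull API.
interface PullStats {
  merged: boolean;
  additions: number;
  deletions: number;
}

// Variant 1: score by net additions (additions minus deletions), never below zero.
function netAdditionsScore(p: PullStats): number {
  if (!p.merged) {
    return 0; // no cred for unmerged code
  }
  return Math.max(0, p.additions - p.deletions);
}

// Variant 2: scale raw additions by the fraction of changed lines
// that were net additions.
function scaledScore(p: PullStats): number {
  if (!p.merged) {
    return 0;
  }
  const changed = p.additions + p.deletions;
  if (changed === 0) {
    return 0; // nothing changed, nothing to score
  }
  const netFraction = Math.max(0, p.additions - p.deletions) / changed;
  return p.additions * netFraction;
}
```

Both variants reduce the score of a pull that mostly rewrites or deletes existing lines, which raw additions would count at full value.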

Here is an instance using this weight algorithm:

There’s one very big difference, which is that the absolute cred scores have changed a lot – now the sum of all scores is about equal to the sum of lines of code changed. (It will be off by a bit because there’s a time decay factor, so for recent pulls not all the cred has been “issued” yet.) However, the relative rankings are pretty similar. They’re similar because the underlying graph structure is the same, and the structure of the graph has a great deal of influence on the scores.

There’s a parameter that lets us influence this: alpha, which basically determines how strongly cred returns to the seed nodes (in this case, pull requests) vs. how easily it flows across the graph. You can think of increasing alpha as increasing the “stickiness” of cred to whatever its source is.
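Here is a toy sketch of that "stickiness" idea, written as a generic seeded (personalized) PageRank. This is not SourceCred's implementation; the graph encoding, the function name `pagerankWithSeed`, and the handling of nodes without outgoing edges are all assumptions made for illustration.

```typescript
// Toy graph: graph[u] lists the nodes that u's score flows to.
type Graph = number[][];

// Generic seeded PageRank sketch (an assumption, not SourceCred's code).
// At each step, a fraction `alpha` of the total score returns to the seed
// distribution and the remaining (1 - alpha) flows along outgoing edges.
function pagerankWithSeed(
  graph: Graph,
  seed: number[], // seed weights, need not be normalized
  alpha: number,
  iterations: number = 100
): number[] {
  const n = graph.length;
  const seedTotal = seed.reduce((a, b) => a + b, 0);
  if (seedTotal <= 0) {
    throw new Error("seed weights must have a positive total");
  }
  const seedDist = seed.map((w) => w / seedTotal);
  let scores = seedDist.slice();
  for (let it = 0; it < iterations; it++) {
    // Start from the share of score that returns to the seed nodes.
    const next = seedDist.map((s) => alpha * s);
    for (let u = 0; u < n; u++) {
      const out = graph[u];
      if (out.length === 0) {
        // Node with no outgoing edges: send its flow back to the seed.
        for (let v = 0; v < n; v++) {
          next[v] += (1 - alpha) * scores[u] * seedDist[v];
        }
      } else {
        const share = ((1 - alpha) * scores[u]) / out.length;
        for (const v of out) {
          next[v] += share;
        }
      }
    }
    scores = next;
  }
  return scores;
}

// Node 0 is a pull request (the only seed); nodes 1 and 2 are its author and a reviewer.
const toyGraph: Graph = [[1, 2], [0], [0]];
const loose = pagerankWithSeed(toyGraph, [1, 0, 0], 0.1);
const sticky = pagerankWithSeed(toyGraph, [1, 0, 0], 0.5);
```

In this toy graph the pull request keeps a larger share of the total score with `alpha = 0.5` than with `alpha = 0.1`, which is the stickiness effect described above.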

The default UI doesn’t yet expose alpha, but I added it in for this experiment. So in the instance linked above, you can set alpha to a higher value and then click “recompute cred”. For example, if we use alpha=0.5:

[Image: cred scores with alpha=0.5]

With alpha=0.5, # of lines changed has a much bigger effect on the distribution, and the cred shifts a lot. Now it’s you, dennis, and thunderBiscuit that have almost all of the cred; others like michaelToth have basically fallen off.

If we wanted to, we could also configure SourceCred so that every person’s score is simply how many lines of code they authored. Right now, if (e.g.) dennis authors a huge PR, not all of that cred will stay with him. Some will go to him, some will flow to the reviewers or people who commented on it. Of the cred that goes to him, some will flow out from his other interactions; e.g. if he gives a thumbs up to someone else’s pull, then that pull will transitively get some cred from Dennis’s giant pull.

If we want, we can turn this behavior off, by changing the weights so that authorship is the only thing that matters:

[Image: weight settings with authorship as the only weighted edge]

This has a super dramatic effect on the output cred. Now dennis has the most cred:

[Image: cred scores with authorship-only weights]

My example of Dennis having a huge PR was not hypothetical. :slight_smile: The reason that Dennis leads when we only consider authorship is because of this pull, which added 10k lines: https://github.com/zkSNACKs/WasabiDoc/pull/26

That pull references issue #2 (written by you) and mentions you and thunderbiscuit by name, so with the default settings, the cred for this giant PR smooths out over several engaged contributors. With the pure-lines-of-code metric, it all just accrues to Dennis. I think this is a good example of the robustness that SourceCred gets from using the PageRank algorithm rather than just counting. It’s also a good use case for SourceCred’s ability to set manual/override weights on individual pieces of content.
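For contrast, "just counting" amounts to a plain tally of added lines per author, roughly like the sketch below. The `MergedPull` shape and the function name are hypothetical, and any data passed in would be illustrative rather than taken from the WasabiDoc repository.

```typescript
// Hypothetical record for a merged pull request; not a SourceCred type.
interface MergedPull {
  author: string;
  additions: number;
}

// Pure lines-of-code metric: each author's score is the sum of lines
// added across their merged pulls. Nothing flows to reviewers,
// commenters, or referenced issues.
function linesByAuthor(pulls: MergedPull[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const p of pulls) {
    totals.set(p.author, (totals.get(p.author) ?? 0) + p.additions);
  }
  return totals;
}
```

Under this tally a single very large pull dominates its author's total, with no graph structure to spread the credit to the people who helped shape it.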

I’m planning to run my own “contribution game” for SourceCred itself (see: Dogfooding SourceCred). For that experiment, I’m planning to review the weights of important contributions, to provide a bit of a manual assist to SourceCred. Some of the same considerations in that thread will apply here, e.g.: will you get paid from the contribution game?

I’ll stop here, curious to get your thoughts!

(Also: I used my moderator tools to boost you out of “new user” status, so you shouldn’t have any more issues re: number of links you can post.)
