Incorporating project goals in cred scoring

Raw PageRank outputs a probability distribution. Used directly as scores for human display, the values are close to unintelligible: a typical score would be 0.000007, which is hard to read and discouraging to the user.

To fix this, we started normalizing scores so that users’ cred sums to 1000. This has the advantage that users’ scores are usually readable numbers between 1 and 1000, and that scores are comparable across contexts. However, it also has some drawbacks. In very large repos (where tons of work has been done) users will almost always have tiny cred scores, because they are “competing” with everyone else for a fixed pool of 1000 cred. Also, once we add time-weighted cred, using a fixed cred total across periods means that users will tend to lose cred every period if there is any new activity, which could set up weird social dynamics. E.g. it could make people feel resentful of newcomers who are “stealing their cred”.
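To make the dilution effect concrete, here is a minimal sketch of the fixed-total normalization. The function name `normalize_cred`, the user names, and the raw scores are made up for illustration; this is not the actual SourceCred code.

```python
def normalize_cred(raw_scores, total=1000.0):
    """Scale raw scores so that they sum to `total`."""
    raw_sum = sum(raw_scores.values())
    return {user: total * score / raw_sum for user, score in raw_scores.items()}


# Hypothetical raw PageRank-style scores for two users.
before = {"alice": 0.000007, "bob": 0.000003}
print(normalize_cred(before))  # alice gets about 700, bob about 300

# A newcomer does some work; alice's and bob's raw scores are unchanged.
after = {"alice": 0.000007, "bob": 0.000003, "carol": 0.000010}
print(normalize_cred(after))  # alice drops to about 350, bob to about 150
```

Neither alice nor bob did anything differently between the two calls, yet both lost half of their displayed cred, which is exactly the dynamic described above.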

One idea I had was to increase the total cred based on the total activity in the repo. However, this metric would be easy for projects (or project participants) to game by just adding more activity.

During a conversation with @mzargham and @brianlitwin, we came up with a pretty interesting idea. What if the project created “goal nodes” (similar to “artifacts” in the artifact plugin, or “components” in the manual mode notes), each with an associated amount of cred? These goals could be features that need to be implemented, bugs that need to be fixed, OKRs, etc. Then, as people do work relevant to the goal, the graph will be (manually? semi-automatically?) updated to have edges connecting their work to the goal. Periodically, the project will assess progress on those goals, and ‘unlock’ cred proportional to progress on the goal. E.g. if the goal is to write a Discourse plugin, and we’ve assigned it a total value of 1000 cred, and the plugin is 60% complete, then 600 cred is unlocked.
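The unlocking arithmetic is simple enough to sketch. The `Goal` class and its field names below are hypothetical, just one way a goal node could carry its cred and progress:

```python
from dataclasses import dataclass


@dataclass
class Goal:
    """A hypothetical goal node: a name, a cred budget, and assessed progress."""
    name: str
    total_cred: float
    progress: float = 0.0  # fraction complete, from 0.0 to 1.0

    def unlocked_cred(self) -> float:
        """Cred unlocked so far, proportional to assessed progress."""
        if not 0.0 <= self.progress <= 1.0:
            raise ValueError("progress must be between 0 and 1")
        return self.total_cred * self.progress


goal = Goal("Discourse plugin", total_cred=1000, progress=0.6)
print(goal.unlocked_cred())  # 600.0
```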

That cred will be distributed based on a PageRank cred analysis, with the goal as the seed vector. This has really nice properties: the cred doesn’t flow just to the person who managed to “land the goal”, as in a bounty, but rather to all the people who participated in the goal. So you could earn cred from the goal by doing code reviews for it, or maintaining project infrastructure that it depends on, or recruiting people to work on the goal, etc. Basically we get to leverage all the benefits of PageRank, but focus it on the particular goal.
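Here is a rough sketch of what “PageRank with the goal as the seed vector” could look like, using plain power iteration. The toy graph, the edge weights, the `alpha` value, and the choice to send mass from nodes with no out-edges back to the seed are all assumptions for illustration, not a description of the real implementation.

```python
def personalized_pagerank(edges, seed, alpha=0.15, iterations=100):
    """Power iteration for PageRank that restarts at the seed distribution.

    edges: {node: {neighbor: weight}}, cred flows from node to neighbor
    seed:  {node: probability}, must sum to 1
    """
    nodes = set(edges) | {n for nbrs in edges.values() for n in nbrs} | set(seed)
    score = {n: seed.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        nxt = {n: 0.0 for n in nodes}
        restart = 0.0
        for node, mass in score.items():
            out = edges.get(node, {})
            total_weight = sum(out.values())
            if total_weight == 0:
                restart += mass  # no out-edges: all mass returns to the seed
                continue
            restart += alpha * mass
            for nbr, weight in out.items():
                nxt[nbr] += (1 - alpha) * mass * weight / total_weight
        for node, prob in seed.items():
            nxt[node] += restart * prob
        score = nxt
    return score


def distribute_cred(scores, people, amount):
    """Split `amount` of cred among `people` in proportion to their scores."""
    people_total = sum(scores[p] for p in people)
    return {p: amount * scores[p] / people_total for p in people}


# Hypothetical graph: the goal points at the implementing PR, which points at
# its author, a code review, and the infrastructure it depends on.
edges = {
    "goal": {"impl_pr": 1.0},
    "impl_pr": {"alice": 2.0, "review": 1.0, "infra": 1.0},
    "review": {"bob": 1.0},
    "infra": {"carol": 1.0},
}
scores = personalized_pagerank(edges, seed={"goal": 1.0})
payouts = distribute_cred(scores, ["alice", "bob", "carol"], amount=600)
print(payouts)
```

In this toy graph alice (the author) gets the largest share, but bob (the reviewer) and carol (the infrastructure maintainer) also earn cred from the goal without having “landed” it themselves.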

This also means that we can start to use cred to guide people’s future contributions, rather than just reward past contributions. It could be a very powerful tool for project management, letting maintainers scope out future work and define the cred rewards that will be earned by people who build those modules.

As a workflow example: As maintainer, I define the new component, the Discourse plugin, and attach it to a goal of having a viable working Discourse plugin. I can define the acceptance criteria along with the goal, what the minimal feature set is to unlock the cred, etc. Once the goal is accomplished, 1000 cred will flow to the contributions (and thus the people) connected to the Discourse plugin.

Someone decides to start working on it. During the research phase, they look into how other plugins work and refine the scope to enumerate the major sub-components that need to be built. There needs to be a database that keeps a local cache of the data from Discourse, graph creation logic that builds the graph from that database, plugin adapters, etc. The contributor creates sub-components in the cred graph for these, and sends them to me for review. We go back and forth a bit, and decide on weights for how these sub-components are connected to the top-level goal. Based on the weights (our sense of how important each piece is to the goal), we can get an implied sense of how much cred will be earned by contributions relevant to those pieces.
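As a first-order estimate of that “implied sense”, you could just split the goal’s cred in proportion to the weights. The weight values below are invented, and the real numbers would come out of the PageRank analysis rather than a straight proportional split, so treat this as a back-of-the-envelope sketch:

```python
def implied_cred(goal_cred, weights):
    """Split a goal's cred across sub-components in proportion to their weights."""
    total_weight = sum(weights.values())
    return {name: goal_cred * w / total_weight for name, w in weights.items()}


# Hypothetical weights agreed between the maintainer and the contributor.
weights = {
    "local cache database": 2,
    "graph creation logic": 3,
    "plugin adapters": 1,
}
estimate = implied_cred(1000, weights)
for name, cred in estimate.items():
    print(f"{name}: about {cred:.0f} cred")
```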

Now any contributor in the project can see relatively well-scoped pieces of the project they can work on, along with a sense of how important each piece is to the project and how much they’ll be appreciated for doing the work. This would start to give us the more “predictive cred” suggested by @cbrocoum.

I should note that the cred is not “guaranteed” in a transactional sense. It’s a guide to how important the project thinks a contribution will be, and the initial weights could turn out to be bad estimates. Also, someone could earn a lot of cred by (say) merging a very buggy Discourse plugin. They would initially have a large chunk of cred. However, if it works very poorly and other people need to go and basically re-write it, the original author’s share of the cred will decrease.

Overall, I believe this system would give project maintainers a lot of ability to guide contributors in the project to important work, and give a clear sense of reward when the work is completed. If the people scoring the goals do a good job of making honest assessments, then looking at the total cred growth of the project over time will also give a valuable signal as to whether the project is successfully executing on its priorities.