If you’d like to diff the full cred instance before and after the new weights, check out:
(In the future, we’ll have fancy cred analysis notebooks to dive into changes like this even more clearly.)
Overall, re-balancing the CredSperiment weights seems like a healthy thing for the community.
A few things to consider:
I don’t have answers to all these things. TBH I’m still wrapping my head around how the SourceCred protocol works on a technical level and how that relates to Cred flows. As such, I’m not (yet) qualified to have super strong opinions on how exactly weights should be configured. I do, however, have an intuition that the appropriate Cred weighting (at this time) is probably somewhere between the old version and the proposed new version. On top of that, there are some larger ideas here that are important to explore so that we can set up SourceCred for positive sum growth. Curious to hear what everyone else thinks, as well as whether this should/could be posted to the public forum.
First, I want to apologize that the post I shared was not clearer. The post I shared was meant to highlight some points that needed further thought and exploration. It was not at all meant to be a definitive statement.
Second, I by no means intended to diminish the importance of your contributions. I’m not super active on the GitHub front so I’m not aware of everything going on in that domain. I saw the GitHub action repo (which is super cool btw!), but not much else. This led me to believe that GitHub was overly weighted vs other contributions. That’s the only point I was trying to make. There was no judgement about the quality or value of contributions, only that it seemed like a single contribution on GitHub was being valued more than ongoing contributions on Discourse (which I now know is inaccurate).
Third, sorry again for the confusion. I really value all your contributions. The entire goal of the post was to highlight how it’s important to make the Cred weights such that contributors (such as yourself!) are recognized and rewarded, but it seemed like the newly proposed Cred weights might consolidate Cred. There are lots of people contributing a lot of value to the project in a lot of ways and it’s really important that we set up SourceCred to recognize and reward all those contributions. This way everyone feels valued and appreciated and we can create value together in a positive sum game.
You raise some interesting questions @burrrata. As I said on the Spontaneous Community Call (thx for the great writeup on that btw!), I’m OK with the new weights. They make sense to me, even if my share has gone down (for now).
I need to think more on some of these issues, but just thought I’d share my 50,000 ft view of some things:
So @mzargham built a python implementation early on in sourcecred/research, which he used when exploring designs. However I believe it’s fallen out of sync with the main repo, which has seen big changes.
Python is the only language I’m half fluent in actually, and I have had lots of ideas for tinkering with the scores. Just need to find that time…
Yeah…Discourse may be overvalued except that we’re kind of relying on it for all non-code work. Until we have incentives, boosting, and other platforms for strategy, biz dev, documentation, etc. etc. etc., it’s difficult to value it. We also want to be compelling to developers, who are scarce…Don’t have strong opinions, just wanted to input some info into the process.
The core value prop of SourceCred is that it recognizes and rewards contributions. That involves creating a contribution graph that gives contributors reputation that can then be correlated to rewards and potentially governance. Traditionally problems like funding, governance, and reputation are separated and that’s part of the reason why they’re so hard to solve: they’re not actually completely separate things. They all connect to each other. SourceCred acknowledges the interdependence of things and connects them in a way that is incentive aligned for positive sum value creation.
While we’re still in the CredSperiment, however, it’s important that TBD (Temporary Benevolent Dictator) helps to direct and organize things.
If we choose to deploy a bonding curve to handle our token sales there will be no negotiations, only discussions and presentations to share the value prop of the protocol
So cool! I had no idea this was a thing. Created an Initiative for this as part of the Retroactive Cred Activation.
Thanks for pushing back @burrrata. You make a good point about the interdependence of funding, governance, and reputation. What excites me most is that SourceCred is exploring new territory in the multidimensional “solution space” here. We should not confine ourselves along traditional lines prematurely.
SourceCred works really well IMO within a single repo, but once you merge contributions across repos, even ones using the same programming language, the scores lose meaning. Activity is just different in nature. Therefore we need a higher level “incentives compiler” to value across projects. For now, that is done by @decentralion via manual mode (manually changing weights for contributions).
Have you seen @mzargham’s article on pagerank in SourceCred?
I don’t pretend to understand how the algorithm works on a deep level, but this was a good primer for me.
Apologies for the unfair comparison earlier. Boosting is just the general term we’ve been using for a long time for valuing contributions and/or contributors more in the graph though. It’s not intended to be a negative term (exactly the opposite! we boost things that are undervalued!). I don’t think we should change the term boosting. Just be more careful in language generally.
One use of boosting that we’ve discussed is for ‘recruitment’, because there’s currently no way of paying someone a real wage from the get go, without paying them a salary outside of the system (which could skew incentives). Building up enough cred to pay the bills currently could take longer than is feasible for many developers. Boosting would be a way for people to express, with skin in the game, that they want to bet on the future value of a contributor. They will actually get a share of that contributor’s cred too, so the booster is taking on risk and reward. This is only at the concept stage for now, but could be a good way to get contributors we couldn’t otherwise.
Glad to have you aboard!
I think SourceCred can (and will) mean many things to many people. A working reputation system is very powerful. Personally what excites me most is the money aspect. Being able to make a living working on open source. But it’s still an experiment that can go in many directions.
For more on boosting, the Initiative for it is a good overview.
One of the features that I think will make comparing activity across repos feasible.
If you want to learn more about Boosting I recommend the Boosting: a prediction market on ideas thread. FTR, however, it’s still in the design phase and the mechanisms to make it possible have not yet been created. The Cred Boosting Initiative should have the most recent status and dependencies to make Boosting a reality.
Also, for a high level overview of SourceCred the SourceCred in 5 minutes thread is a great place to start
Indeed, those messy humans…when I wrote my article on SourceCred and DAOs, I was depressed by how intractable many of these problems are. Not really any working solutions yet, though I think the SourceCred approach is starting to work already:)
Have you seen the podcast interview with @mzargham? Recorded a while ago but we haven’t put it out on social media yet. Goes into some of these issues.
Thanks everyone for participating in this discussion. On a meta-level, I’m glad to see this conversation happening, uncomfortable parts and all. The difficulty around deciding what “fair” is for a system like this, and negotiating changes to it in a way that lets people still feel respected, enfranchised, and heard, is one of the biggest challenges for us (and for anyone who uses SourceCred). As @anon60584824 mentioned, the “human bits” of deploying a system like SourceCred are what makes it really challenging.
I really appreciate that in this thread, everyone has made a clear effort to respect and care for “the human bits” of other SourceCred contributors: working to understand others’ perspectives and intentions with good faith, hearing when feelings have been hurt and apologizing, and generally being supportive and honest with each other. Our ability to thrive is going to depend on our ability to support each other in these ways, just as much as it will depend on our ability to implement increasingly clever algorithms.
With that said: I think a lot of the challenges that have been surfaced in this thread reflect the immaturity of the SourceCred algorithm and parameter system. The promise of SourceCred’s graph-based approach is that we can mint cred at the level of things we really care about – goals like deployability (as supported by having a docker container and GitHub actions) or documentation (as supported by having a bevy of Getting Started guides on the Discourse). Once we have a supernode that represents the high level goal, we can tune the cred flow for the high level goal, and that will propagate out to everyone who worked on it.
However, we haven’t finished building the system that lets us do this. @Beanow is hard at work on the initiatives plugin, and afterwards we need to refactor the Graph to give us better APIs for controlling cred minting. It will probably be somewhere between “weeks” and “two months” before the more sophisticated, goal-based cred minting process is online.
While we’re waiting for that better system, we have an extremely crude alpha version, which mints cred solely on the amount and type of activity that SourceCred observes. As such, we mint cred for every Discourse post, for every GitHub pull request, and so forth. We can tune the cred minting (and thus the relative cred weights) by changing those overall weights.
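To make the alpha scheme concrete, here is a minimal Python sketch of activity-based minting as described above: cred minted per node type is just the activity count multiplied by a type weight. The weights are the "old" values from the weight table later in this thread; the activity counts and the function name `mint_by_activity` are made up for illustration. It also ignores how cred then flows through the graph, so treat it as a toy model rather than SourceCred's implementation.

```python
from fractions import Fraction

# "Old" per-type weights, taken from the weight table later in this thread.
OLD_WEIGHTS = {
    "discourse_topic": Fraction(8),
    "discourse_post": Fraction(2),
    "github_pull": Fraction(1),
    "github_issue": Fraction(1),
    "github_comment": Fraction(1, 16),
    "github_review": Fraction(1),
}


def mint_by_activity(counts, weights):
    """Return cred minted per node type as count * weight (toy model)."""
    return {kind: counts.get(kind, 0) * weight for kind, weight in weights.items()}


# Hypothetical activity for one week (invented numbers, not real data).
example_counts = {
    "discourse_topic": 3,
    "discourse_post": 20,
    "github_pull": 4,
    "github_comment": 16,
}

minted = mint_by_activity(example_counts, OLD_WEIGHTS)
total = sum(minted.values())

for kind, cred in minted.items():
    share = float(cred / total)
    print(f"{kind:16s} minted={float(cred):6.2f} share={share:.1%}")
```

Because every node of a given type gets the same weight, the only available lever is the per-type weight itself, which is why a single setting cannot fit both an in-depth guide and a placeholder post.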
This is a really crude system, kind of like hitting the cred distribution with a cudgel. Any weight we set for Discourse posts will either be too low for super in-depth guides and detailed discussion of the system, or too high for a placeholder post retroactively initializing an initiative. Any weight we set for GitHub pulls will either be too low for an involved refactor which increases Discourse performance, or it will be too high for a tiny PR to remove a deprecated file.
Ideally, we would already have the next-gen system online, and I wouldn’t need to make crude and contentious changes to the weights. But, we’re not there yet, and I have an ongoing responsibility to tune the system to try to accomplish a few goals:
When I first turned on the system, we had more than a year of GitHub contribution history, and very little activity on the Discourse. When I gave the system what I considered “reasonable weights”, it responded by giving almost all of the cred to myself and @wchargin, due to our very long history of code contributions. However, I felt it was important to cultivate the nascent Discourse community, and give meaningful cred to people who had started contributing in non-code ways.
So I deliberately chose weights that were unreasonable: a Discourse thread had 8x more weight than a GitHub pull request; a Discourse post had 32x more weight than a GitHub comment. Looked at directly, these weights didn’t really make sense. However, choosing these weights made for the cred distribution I felt was healthiest for the project at the time. I don’t regret setting the weights this way, but I should have anticipated and more clearly communicated that they would need to change once Discourse activity levels approached our GitHub activity levels.
Now, several months later, we have a thriving Discourse community that is accomplishing a ton: exploring and explaining how SourceCred works, coming up with norms and patterns for our community, dog-fooding new features like Artifacts and Initiatives. This is great! But it also means that, because I chose unbalanced weights earlier, our overall cred is becoming very unbalanced. We’ve reached the point where the cred earned from a single week focused on Discourse can outpace the cred earned from several months of work on GitHub.
While we are still working with a set of crude levers to tune the cred, I still need to use those levers to do the best I can to keep the overall system in alignment. In this case, this means a change to make the Discourse and GitHub weights more consistent. If we had the tools to do so, I’d prefer to change the weights prospectively, not retroactively, and “stand by” our past weights (perhaps up to the most recent week or two). Unfortunately, we just don’t have that capability yet–and the result is a fairly disruptive change to cred.
The good news is, just as we can retroactively update the cred now, we will retroactively update the cred with the new tools we develop in the coming months. Therefore, you can think of this current cred distribution as a temporary “working copy” that we’re going to keep improving. And Grain distribution always tries to keep the lifetime Grain flows aligned with the most recent Cred; therefore, people with too little cred right now will be “made whole” once we launch the improved system.
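As an illustration of the "made whole" idea, here is one way such a catch-up payout could be sketched in Python. This is an assumption about the general shape of the policy, not the actual Grain distribution code: the function name `balanced_payout`, the participant names, and all numbers are hypothetical. Each payout goes to whoever has received less lifetime Grain than their current Cred share would imply.

```python
def balanced_payout(budget, cred, lifetime_paid):
    """Split `budget` toward participants underpaid relative to current cred.

    Hypothetical sketch: target lifetime grain is each participant's cred
    share of (all grain ever paid + this budget); the budget is divided in
    proportion to how far below that target each participant is.
    """
    total_cred = sum(cred.values())
    if total_cred == 0:
        return {who: 0.0 for who in cred}
    total_after = sum(lifetime_paid.values()) + budget
    underpaid = {
        who: max(0.0, total_after * c / total_cred - lifetime_paid.get(who, 0.0))
        for who, c in cred.items()
    }
    total_underpaid = sum(underpaid.values())
    if total_underpaid == 0:
        return {who: 0.0 for who in cred}
    return {who: budget * u / total_underpaid for who, u in underpaid.items()}


# Made-up example: cred was revised so alice now holds 75% of it,
# but past payouts were split evenly.
payout = balanced_payout(
    budget=100,
    cred={"alice": 75, "bob": 25},
    lifetime_paid={"alice": 50, "bob": 50},
)
print(payout)
```

In this toy case the whole budget goes to alice, after which lifetime Grain (150 vs 50) matches the 75/25 Cred split.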
We’ll also launch a new UI which allows exploring cred at the level of initiatives, artifacts, and other supernodes. So rather than SourceCred explaining cred in terms of raw activity, like so:
It will be able to communicate my cred in terms of specific initiatives or artifacts I’m connected to (like setting up the core graph module and algorithm, or my leadership roles within the project as a whole). I think that will help these discussions a lot, by making it easy to see why other people have Cred, in terms of the specific initiatives and artifacts they supported.
This won’t be a perfect system either, and we’ll keep on having friction as we work on SourceCred. But I hope by improving the algorithm, we’ll be able to make more intentional changes to the system that won’t be as frustrating or lead to so much churn in the cred.
In the meantime, we’re still very much in CredSperiment mode, so we need to accept there will be some turbulence in deploying our still-experimental and alpha quality system. Thanks for being along for this ride, y’all!
This thread has been a great learning experience, but I don’t see a clear consensus on the actual CredSperiment weight changes. It’s important that we value all types of contributions and create a welcoming and rewarding environment for contributors. To achieve this I suggest either decreasing the Discourse weights or increasing the GitHub weights, but not both.
Took me a while to pitch in on this thread, but have been reading and want to briefly add some of my thoughts.
I’m agreeing on both major points here. We need to be careful in our language, and acknowledge that settling on a set of values is a subjective human process.
My feeling is the last part has been making its way into the system design for a while now. Perhaps it’s an idea to start building recommendations / guidelines on how to have these discussions in a constructive way.
As an example: I would recommend entirely avoiding using people’s total cred scores in a value discussion. Instead try to answer the question “what do we care about as a community?” and when bringing in examples, use concrete examples. Such as “This PR” or “This Initiative”. While still being mindful that concrete work has people working on it too.
I’ve got a couple more thoughts on this, but feel like this deserves its own thread.
As @decentralion explained well, the activity-based minting of cred is flawed. So we won’t solve every problem raised here without moving away from this system.
Overall I think the suggested change is OK, with one main comment. And from my experience perhaps even conservative.
My largest concern for the weights is inclusion.
With the new weights, an “I’m new and have a question” topic would be valued as much as a PR from someone who’s familiar with the project. That seems inclusive to me.
The previous weights were very poor for developers. And I’ll expand on that. With the new weights, it’s good enough, though maybe conservative. And I’m happy with that. It will prevent gaming PRs and we should use supernodes to make up the difference.
When I think about writing detailed topics, like About Champions and Heroes, this would take me about 2 days. That includes watching the reference video, fleshing the idea out on a notebook, sleeping on it, discussing the term on discord, drafting the topic, getting images, finalizing the post, etc. (Although granted, I’m a developer so doing this every day would drive me nuts.)
On the other hand it’s taken me roughly from October 20th, till December 9th to implement Discourse mirror revision. With some back of the envelope math, that’s 50 days for 18 PRs. Making about 2-3 days per PR on average. And I’m certain the Initiative system will have a slower rate than that.
For deeper context of what this is worth: Champions is still a concept that isn’t fully defined and explained, but has been a useful metaphor. While just 1 of the 18 PRs includes #1431 which has made it possible for anyone to use SC on any public Discourse instance, no longer needing API keys to do so.
So my hunch would be, a good PR is more expensive to create than a good forum topic. Even though my skills specialize on the PRs.
In the before snapshot cred, I have 859 Cred. Which is approximately 70% Discourse, 30% GitHub.
In the after snapshot cred, I have 1068 Cred. Which is approximately 21% Discourse, 79% GitHub.
Obviously the above approximation is really flawed. But it’s close enough to support what I’m getting at next.
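For reference, the rough split above works out as follows. This is a small Python sketch using only the totals and percentages quoted in the two snapshot lines; the helper name `split_cred` is just for illustration, and the results inherit all the flaws of the approximation.

```python
def split_cred(total, discourse_pct):
    """Split a cred total into (discourse, github) using a rough percentage."""
    discourse = total * discourse_pct / 100
    return discourse, total - discourse


before_discourse, before_github = split_cred(859, 70)   # before snapshot
after_discourse, after_github = split_cred(1068, 21)    # after snapshot

print(f"before: discourse ~{before_discourse:.0f}, github ~{before_github:.0f}")
print(f"after:  discourse ~{after_discourse:.0f}, github ~{after_github:.0f}")
print(f"github cred changes by a factor of ~{after_github / before_github:.1f}")
print(f"discourse cred changes by a factor of ~{after_discourse / before_discourse:.2f}")
```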
In my experience current weights do not make development viable. When a topic receives 8x more cred than a PR and takes me about a day less to work on, there’s no incentive for me to write PRs.
Or to invert that, I would need to put in an extra day of work, and would be rewarded with 12.5% of what I would have gotten for a topic.
The suggested weight change doesn’t fully close the gap from my heuristics / intuition. But I think it greatly improves inclusion of developers.
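The back-of-the-envelope numbers in this post can be checked with a short Python sketch. It uses only figures quoted above (about 2 days per detailed topic, 50 days for 18 PRs, the old 8:1 topic-to-PR weight, and the equal-weight case mentioned for the new weights). The function name is hypothetical, and "cred per day" is a simplification that treats weight as the whole reward.

```python
from fractions import Fraction

DAYS_PER_TOPIC = Fraction(2)      # "about 2 days" for a detailed topic
DAYS_PER_PR = Fraction(50, 18)    # 50 days for 18 PRs


def pr_vs_topic_per_day(pr_weight, topic_weight):
    """Cred per day of PR work as a fraction of cred per day of topic work."""
    pr_rate = Fraction(pr_weight) / DAYS_PER_PR
    topic_rate = Fraction(topic_weight) / DAYS_PER_TOPIC
    return pr_rate / topic_rate


days_per_pr = round(float(DAYS_PER_PR), 2)
one_pr_vs_one_topic = Fraction(1, 8)            # old weights: topic is 8x a PR
old_rate = pr_vs_topic_per_day(1, 8)            # old 8:1 weights
equal_rate = pr_vs_topic_per_day(1, 1)          # topic and PR weighted the same

print(f"days per PR: {days_per_pr}")
print(f"one PR earns {float(one_pr_vs_one_topic):.1%} of one topic (old weights)")
print(f"per day of work, old weights: {float(old_rate):.0%}")
print(f"per day of work, equal weights: {float(equal_rate):.0%}")
```

Under these assumptions, equal weights still leave a day of PR work earning less than a day of topic writing, which matches the point that the change does not fully close the gap.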
In the time I’ve been with SourceCred, I’ve rarely put as much effort into an Issue as I have put into Topics. But Issues are most definitely worth less than a PR.
Using Discourse for many situations is preferred over issues in our community, for visibility and inclusion.
Conversely pull request reviews are really valuable. So 2x for them makes sense to me.
The counter argument I could make for a 2x issue weight is: to incentivize new people to give feedback and bug reports.
(Assuming our existing developers won’t game this by creating an issue for every PR they submit and fixing it immediately after to pad their scores)
This particular point I think is interesting to touch on too.
Right now, we have one bridge between GitHub and Discourse Cred. Which is through Identities. Which I mentioned a bit here: Add identity for vsoch by Beanow · Pull Request #9 · sourcecred/cred · GitHub
I’m working on another major way to bridge GitHub and Discourse Cred. With Unified reference detection.
With this system in place, it will be much easier to flow cred directly between nodes and across the different environments. So posts can link to issues, can link to PRs, can link to topics, can link to users, …
I’m now going to reply to a lot of specific points that were brought up in the thread.
I agree; I want to extend review culture and workflows from code contributions to all major contributions to SourceCred, including artifacts and documentation. Moving the source of truth for artifacts into source control in GitHub would make a lot of sense. Watch this space.
Y’all are right, it doesn’t make sense for the weights to be the same. I propose new weights at the bottom.
Yep, having sources of income and security external to the project make a big difference for contributors, especially in this immature stage of the project. The goal of SourceCred is to recognize and reward people fairly for the value of their contributions. In the case where people are earning salaries for working on SourceCred, I think it makes sense for the companies paying their salaries to receive a portion of their Cred; this makes sense because the entity paying the salary is helping to enable the work. This will also make SourceCred easier to adopt in open-source projects where corporate sponsorship is happening, make it easier for employees to contribute to SourceCred, and enable more SourceCred contributors to have diversified economic support. I’m planning a fuller post on this subject soon, please hold future discussion of this point for that post.
Our inability to have time-scoped weights is definitely a bug and not a feature. What I care about most in this change is the way it changes the future incentives (i.e. we need more reward for developers), not the way it changes the past history. We should develop the ability to change weights in a time-scoped way; it will be a valuable tool in our toolbox going forward. (To do this, we need more developers.)
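To illustrate what "time-scoped" weights could mean, here is a minimal Python sketch. It is purely hypothetical: the class name `TimeScopedWeight`, its interface, and the effective date are invented for this example, and no such capability exists in SourceCred according to this thread. The idea is that each weight change carries an effective date, so older contributions keep the weight that applied when they were made.

```python
from bisect import bisect_right
from datetime import date


class TimeScopedWeight:
    """A weight that changes on given dates (hypothetical design sketch)."""

    def __init__(self, initial, changes=()):
        # changes: iterable of (effective_date, weight) pairs, in any order.
        self._initial = initial
        self._changes = sorted(changes)
        self._dates = [effective for effective, _ in self._changes]

    def at(self, when):
        """Return the weight in force on `when` (changes apply from their date)."""
        index = bisect_right(self._dates, when)
        if index == 0:
            return self._initial
        return self._changes[index - 1][1]


# Example: a pull request weight going from 1x to 16x on an invented date.
pull_weight = TimeScopedWeight(initial=1, changes=[(date(2020, 1, 6), 16)])

print(pull_weight.at(date(2019, 12, 1)))   # contribution before the change
print(pull_weight.at(date(2020, 2, 1)))    # contribution after the change
```

With something like this, a weight change would be prospective by default, and a retroactive change would become an explicit choice rather than the only option.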
I quite agree with this. I think that handling funding, governance, reputation, and rewards together, we are more likely to create a viable, functioning system than if we look at just one part in isolation.
Thanks for this ask, @anon60584824. I think one thing that’s really clear from this discussion (as a few others have mentioned, too) is that we need to develop clear guidelines around how to have these discussions. One of these guidelines will be: focus the discussion on valuing contributions, not the contributors.
This will also mean making changes to the UI. Right now the UI strongly guides you to focus on the impact on contributors’ cred totals, because that is the salient information in the UI. I think we should change it to show things like:
Having the UI show this information will guide us all away from focusing on the (zero-sum) game of who has the most relative cred, and towards the (positive sum) game of aligning on what kinds of contributions we want to reward and recognize.
@Beanow, thanks for explaining these dynamics; this has been on my mind for several weeks, and is a big part of the motivation for these changes.
Right now, so much in SourceCred is blocked on development. For example, just things that came up in this discussion:
We need to ensure that development activity is rewarded, because as a community we have a hard dependency on developing more tools.
I’m going to come up with a new weight change proposal based on the feedback on this thread. Concurrently, I’m going to take a stab at prototyping a new UI which follows the guidelines we’re coming up with to focus on evaluating contributions and not contributors.
If I can prototype that UI fast (i.e. by tomorrow night) then I’ll post the weight proposal with the prototype UI. If not, I’ll just use the existing UI.
The CredSperiment payout for last week will be delayed until the new weights merge.
Based on the feedback here, I’ve come up with a new set of weights:
| Node Type | Old Weight | New Weight |
|---|---|---|
| Discourse Topic | 8x | 8x |
| Discourse Post | 2x | 2x |
| GitHub Pull Request | 1x | 16x |
| GitHub Issue | 1x | 4x |
| GitHub Comment | 1/16x | 1x |
| GitHub Review | 1x | 8x |
These new weights incorporate two major points of feedback:
I also gave a relative boost to reviews, as code review is very important.
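To see the size of each change at a glance, the table can be turned into a few lines of Python. The weights are copied from the table above; the variable and function names are only for illustration.

```python
from fractions import Fraction

# (old weight, new weight) per node type, copied from the table above.
WEIGHTS = {
    "Discourse Topic": (Fraction(8), Fraction(8)),
    "Discourse Post": (Fraction(2), Fraction(2)),
    "GitHub Pull Request": (Fraction(1), Fraction(16)),
    "GitHub Issue": (Fraction(1), Fraction(4)),
    "GitHub Comment": (Fraction(1, 16), Fraction(1)),
    "GitHub Review": (Fraction(1), Fraction(8)),
}

OLD, NEW = 0, 1

# Multiplier applied to each node type by the change.
change = {name: new / old for name, (old, new) in WEIGHTS.items()}


def ratio(a, b, which):
    """Weight of node type `a` relative to node type `b` (old or new)."""
    return WEIGHTS[a][which] / WEIGHTS[b][which]


for name, factor in change.items():
    print(f"{name}: {factor}x")

print("topic vs PR:", ratio("Discourse Topic", "GitHub Pull Request", OLD),
      "->", ratio("Discourse Topic", "GitHub Pull Request", NEW))
print("post vs comment:", ratio("Discourse Post", "GitHub Comment", OLD),
      "->", ratio("Discourse Post", "GitHub Comment", NEW))
```

Read this way, the Discourse weights are unchanged, while a topic goes from 8x a pull request to half of one, and a post goes from 32x a GitHub comment to 2x.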
I’ve also taken a stab at my suggestion of changing the lens, by building a “prototype UI” that analyzes the weights through the lens of seeing what contributions we value, instead of the lens of which people gain cred. The results were illuminating.
Here’s a set of four bar charts showing the cred-by-activity-type under the old weights (left) and the new weights (right). It also shows the total cred across all time (top) and just the cred for the last full week (bottom).
Here are the things that really stand out to me:
I propose that for now, we focus on getting the “latest cred” totals right first. The reason is this puts us on a tighter feedback loop: each week we have a pretty clear memory of what happened in the week, and how much we value it, so we can more easily come to agreement on what cred “makes sense”. Once we’ve gotten good at latest cred, we can orient on the harder problem of all time cred.
Looking at it through this lens, we still have room for improvement – rewarding development activity with only 20% of the latest cred is too low, especially considering how blocked we are on development. However, since 20% is much more reasonable than 5%, this weights change looks like an improvement.
Acting as TBD, I’m going to merge the new weights, and distribute corresponding payouts, so we can keep to our normal payout schedule (+/- 12 hours). However, I’m happy for this discussion to keep going. The CredSperiment is… an experiment! And we can change the weights every week if we want to. (Eventually we will settle down and Cred will become more stable, but that’s still a while out.)
Of particular note–as we improve the underlying infrastructure (i.e. move away from activity cred), we’ll get ever more powerful tools for configuring SourceCred in a way that rewards the contributions that we need and appreciate the most.
Thanks to @wchargin, who made major contributions towards enabling cred analysis notebooks. The data analysis in this post was done in a prototype cred analysis notebook.
@decentralion Very interesting! I’m eager to try this visualization out.
One thing I do want to point out is that we should be careful using just one week as our benchmark, because these values are highly influenced by activity, and activity is unpredictable and seasonal. (Interested in what @mzargham’s thoughts on this are.)
Most of my PRs don’t happen at regular 2-3 day intervals. They happen in bursts. A big reason why I didn’t do much on GitHub the last few weeks was the holidays. Being with the family I could sneak in some forum posts, but didn’t have much opportunity to sit down for a focused development session.