(Note: there’s so much to reply to in this thread! I think we should maybe start biasing toward splitting new ideas off into their own threads. For now, I’m going to try to do one-reply-per-post and engage with a lot of really interesting comments. :D)
I’m not the CEO of SourceCred (which isn’t a company, and doesn’t have investors, executives, or a CEO). So I also don’t have some big claim on future economic rewards from SC that I get to monopolize and then dole out to other insiders. If/when SourceCred does a token distribution or similar, it should go through SourceCred’s own public cred attribution, without any special “insider mode” where the people running the project get to pay themselves big opaque bonuses. If, when the distribution is actually happening, I defect and try to do this, you should throw these words back in my face and organize a fork of the project.
(You might accurately suspect that I don’t subsist on blockchain pixie dust. I’m an engineer on @Protocol Labs’ payroll; they’ve decided to support the project, and that support comes in the form of paying people to work on it full-time. When we get around to a full, nuanced cred distribution, I think PL should and will receive a lot of cred, both for funding and a lot of organizational/operational support. Figuring out exactly how to do this fairly will be a really interesting question.)
This is a really interesting idea. I’d love to read more about your thoughts here, and suggestions for how we could apply this insight to SourceCred. Right now our infrastructure is focused on recognizing the value that devs provide (looking at Git and GitHub), but the Odyssey Hackathon is coming up, and I think we’re going to focus a lot there on building tooling that makes it easier for us to recognize other contributions. So, I’d like to experiment with your theory by doing a better job of valuing and rewarding all the other labor that SC needs to be successful.
Can we support you in learning JS, via hacking on our codebase? The Live Coding Session we did a few months ago could be useful for orienting on the tools in the codebase and how to get started.
As for specific approaches on time-weighted cred: in the long term, I expect we’ll do time-filtering via “Personalized Pagerank”, where we have the seed vector for exploring the graph start by pointing to all the contributions in a certain timeframe, and then see where cred propagates from there. In the short term, I think that something like just time-filtering the contributions could work well. Suppose that C(t) is the cred distribution for all contributions up through time t. Then if we want the cred just for the interval [t, t+1] we could define it as C(t+1) - C(t), where subtraction is just elementwise score differences. This is pretty easy to hack; you could do it right now by just running SourceCred one week, then re-running it a week later and seeing what the diffs are.
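To make the diffing hack concrete, here is a rough sketch (in Python rather than our JS codebase, just to keep it short). It assumes each SourceCred run has already been boiled down to a plain mapping from contributor to score. The names `cred_diff`, `week1`, and `week2`, along with the numbers, are made up for illustration; none of this is an actual SourceCred API.

```python
from typing import Dict


def cred_diff(earlier: Dict[str, float], later: Dict[str, float]) -> Dict[str, float]:
    """Elementwise C(t+1) - C(t).

    Assumption: a contributor missing from one snapshot has a score of 0 there.
    """
    names = set(earlier) | set(later)
    return {name: later.get(name, 0.0) - earlier.get(name, 0.0) for name in sorted(names)}


# Hypothetical scores from two runs a week apart (illustrative numbers only).
week1 = {"alice": 10.0, "bob": 4.0}
week2 = {"alice": 12.5, "bob": 4.0, "carol": 3.0}

interval_cred = cred_diff(week1, week2)
print(interval_cred)
```

One thing to watch for: if later contributions shift cred away from someone, their difference for the interval may well come out negative, so we’d need to decide how to interpret that.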
For the goal of having a robust, reliable API, we’ll obviously need to actually integrate this into the system in a more principled way. Actually getting the data from GitHub is straightforward: we can modify the schema to get creation dates for all the posts. However, it’s not totally clear how to incorporate timestamps into the Graph. Giving every node a timestamp would be a bit odd because some nodes, like user accounts, don’t have natural timestamps associated with them. @mzargham has proposed organizing the graph to have different fundamental types of nodes including an “Event/Post” type which would have a natural timestamp associated, so implementing that system would be one way of doing this in a principled way. As a hack, I might be willing to condone adding a nullable timestamp field to nodes in the graph, although I suspect @wchargin might object.
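For what it’s worth, the nullable-timestamp hack could look something like the sketch below (again in Python for brevity). The `Node` class, its `address` and `timestamp_ms` fields, and the `nodes_up_to` helper are all hypothetical and not the real Graph API. I’m also assuming that nodes without a timestamp, like user accounts, are always kept when filtering, which is a design choice we’d want to argue about.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Node:
    address: str
    # None for nodes with no natural timestamp (e.g. user accounts).
    timestamp_ms: Optional[int] = None


def nodes_up_to(nodes: List[Node], cutoff_ms: int) -> List[Node]:
    """Keep untimestamped nodes, plus timestamped nodes created at or before the cutoff."""
    return [n for n in nodes if n.timestamp_ms is None or n.timestamp_ms <= cutoff_ms]


# Hypothetical nodes; the addresses and timestamps are illustrative only.
nodes = [
    Node("user/alice"),
    Node("issue/1", timestamp_ms=1_000),
    Node("issue/2", timestamp_ms=5_000),
]

kept = [n.address for n in nodes_up_to(nodes, cutoff_ms=2_000)]
print(kept)
```

Running the cred computation on the filtered nodes for two different cutoffs would give the C(t) snapshots needed for the interval calculation, without having to wait a week between runs.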