Timeline Cred Prototype

After much late-night hacking, I now have a live timeline cred prototype to share!

Here’s what the UI looks like:

The basic approach is that we slice the history of the project by week. For each week, we run PageRank and assign scores, with some time-based twists:

  • we set the edge weights based on time: edges that don’t yet exist have 0 weight, edges that were just created have full weight, and older edges have their weight decayed exponentially
  • we run PageRank with a seed vector pointing to all the nodes created in that week
  • we normalize the scores so that the total user score in that week equals the sum of the node weights for every node created in that week. So if a pull request has a node weight of 4, then creating one PR in that week adds 4 more cred to be split across users
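The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SourceCred's actual implementation: the function names, the `half_life_weeks` and `alpha` parameters, the uniform seed over new nodes, and the handling of nodes without outgoing weight are all hypothetical choices made for the example.

```python
def decayed_weight(created_week, current_week, half_life_weeks=4.0):
    """0 before the edge exists, 1 when just created, then exponential decay.
    The half-life parameterization is an assumption for this sketch."""
    if created_week > current_week:
        return 0.0
    return 0.5 ** ((current_week - created_week) / half_life_weeks)


def pagerank(nodes, edges, seed, alpha=0.15, iterations=100):
    """Power iteration that teleports to `seed`. edges: [(src, dst, weight)]."""
    out_total = {n: 0.0 for n in nodes}
    for src, _, w in edges:
        out_total[src] += w
    score = dict(seed)
    for _ in range(iterations):
        nxt = {n: alpha * seed[n] for n in nodes}
        for src, dst, w in edges:
            if w > 0:
                nxt[dst] += (1 - alpha) * score[src] * w / out_total[src]
        # Assumption: nodes with no outgoing weight send their mass back to the seed.
        dangling = sum(score[n] for n in nodes if out_total[n] == 0)
        for n in nodes:
            nxt[n] += (1 - alpha) * dangling * seed[n]
        score = nxt
    return score


def week_cred(nodes, edges, week, half_life_weeks=4.0):
    """nodes: {id: (created_week, node_weight, is_user)}
    edges: [(src, dst, created_week)], with both directions listed explicitly."""
    new_nodes = [n for n, (created, _, _) in nodes.items() if created == week]
    if not new_nodes:
        return {}
    # Seed vector: uniform over the nodes created this week (an assumption).
    seed = {n: (1.0 / len(new_nodes) if n in new_nodes else 0.0) for n in nodes}
    weighted = [(s, d, decayed_weight(c, week, half_life_weeks)) for s, d, c in edges]
    scores = pagerank(list(nodes), weighted, seed)
    # Normalize so total user cred equals the summed weight of this week's new nodes.
    budget = sum(nodes[n][1] for n in new_nodes)
    user_total = sum(scores[n] for n in nodes if nodes[n][2])
    if user_total == 0:
        return {}
    return {n: budget * scores[n] / user_total for n in nodes if nodes[n][2]}


# Toy graph: one PR with node weight 4, authored by alice in week 1.
nodes = {"alice": (0, 0, True), "bob": (0, 0, True), "pr1": (1, 4, False)}
edges = [("pr1", "alice", 1), ("alice", "pr1", 1)]
cred = week_cred(nodes, edges, week=1)
```

In this toy graph the week-1 user cred sums to 4, matching the PR's node weight, and all of it goes to alice because bob has no connection to the new node.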

I’ve found the results really interesting so far. It’s cool being able to see when people got involved in a project and when they dropped off, and to see how the total activity level in a project has varied over time. In the case of SourceCred, you can see that when @wchargin and I were pair programming together every week, we had a lot of output; after he joined Google, my output also decreased.

I would love feedback on this feature, e.g. how to improve the UI.

Thanks to @mzargham for a lot of ideas that went into timeline cred.

Here is a live prototype you can play with!


Very interesting!

Wasn’t sure how I felt about the chart at first glance, but the more I looked at it the more it made sense. The time-varying amplitudes convey the general amount of work done each week. Zooming in I can get a rough sense of the relative amounts each person contributed.

I didn’t have an aha! moment, however, until I ran this on the main repo I’ve been contributing to for ~10 months: Decred’s documentation repo. This repo is almost entirely markdown files containing documentation, with very little coding. I don’t yet have a good reference point to compare this to projects I’ve contributed code to, but the initial results seem promising. The chart already reflects my understanding of the evolution of the repo pretty well.

Here’s a screenshot of the chart for last year.

I’ve also used @decentralion’s GitHub Pages hack to host a live version if anyone wants to explore (NOTE: cred to @decentralion for the script).

Looking at this graph is a lot more meaningful, as I can immediately start seeing events I recognize and telling myself stories (‘oh, there’s where I started contributing!’, ‘there’s the maintainer I work with that has lots of cred…’, ‘there’s a period of inactivity for a couple months before I started, wonder what was happening then’, ‘who is that early contributor I’ve never met that did tons of work?’, and so on).

As for the UI, a couple impressions:

  • I’m finding myself wanting to ‘drill down’ and explore. For instance, I see this big spike, I click on it and get a list of contributors, but I can’t click on the names to see what the contributions are.
  • Lines are a little “noisy”. This may just be an artifact of people working in bursts. However, when looking at the chart, I wonder if a more averaged/smoothed line would make more sense, from an aesthetic but also maybe accuracy perspective?
  • Related to above, you say “older edges have their weight decayed exponentially”. I have an urge to have a slider/knob that allows me to change the weight (power?) of the exponential, adding more or less weight to past contributions (presumably). This could produce the “smoothing” I want to try, and possibly surface new patterns/insights.
  • If there are more than 2-3 contributors, the lines become hard to read. Perhaps have a way to toggle on/off certain contributors so you can see lines clearly, make comparisons that are more meaningful (e.g. let’s look at the top 3, or these certain people I’m interested in). Coinmetrics does this well with their charts.
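To make the smoothing and toggling ideas concrete, here is a minimal sketch. It assumes each contributor's weekly cred is available as a plain list of numbers; the `moving_average` and `top_contributors` helpers and the `window` parameter are hypothetical and not part of the prototype. A trailing moving average is just one possible smoother; changing the exponential decay itself would be a different approach.

```python
def moving_average(values, window=4):
    """Trailing moving average; early points average over fewer weeks."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def top_contributors(series, n=3):
    """Names of the n contributors with the most total cred.
    series: {name: [cred per week]}"""
    return sorted(series, key=lambda name: sum(series[name]), reverse=True)[:n]


# Bursty weekly cred becomes a flatter line after smoothing.
bursty = [0, 4, 0, 4]
smooth = moving_average(bursty, window=2)

# Only the selected contributors would be drawn on the chart.
series = {"a": [1, 1], "b": [5, 0], "c": [0, 1]}
shown = top_contributors(series, n=2)
```

A larger `window` gives a smoother but laggier line, so it is the kind of value that could sit behind a slider.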

Personally, I think it would be cool to have a few parameters exposed in the UI that I can tweak (e.g. the exponential weighting to smooth/weight older contributions more or less; maybe a select number of weights in the weight configurations). Then have the chart react in real-time (or close enough). This could allow me to explore the very large “solution space”, perhaps quite efficiently, with the goal of nudging the graph until it reflects some reality or pattern I’m trying to analyze.

There could also be some value in ‘content discovery’. E.g. being able to click on a spike in activity, and see what contributions created that spike.

This is also making me realize that in addition to this view, it would be interesting to view total cred evolving over time. But that is perhaps for another issue.
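A rough sketch of that cumulative view, assuming the per-week cred for each contributor is already available as a list (the `cumulative_cred` helper and the data shape are hypothetical):

```python
from itertools import accumulate


def cumulative_cred(series):
    """Running total of cred per contributor. series: {name: [cred per week]}"""
    return {name: list(accumulate(weekly)) for name, weekly in series.items()}


totals = cumulative_cred({"alice": [4, 0, 2], "bob": [0, 1, 1]})
```

Each resulting list never decreases, so the chart would show accumulated cred rather than weekly activity.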


FYI, when I input a very large repo with a long (possibly complicated) history, the timeline view just hangs. For example, this repo dates back to 2015 and was created from a fork of a project that goes back further (if that matters): https://s-ben.github.io/dcrd-sourcecred-demo/site/timeline/decred/dcrd/

Presumably this is just the graph choking. The regular ranking page is working fine: https://s-ben.github.io/decred-meta-sourcecred-demo/site/prototype/decred/decred-meta/

To make sure I understand this: essentially, SourceCred is run for each week with a recency bias, creating a cred distribution for that week, and the cred listed below the chart is the “accumulated” cred over the entire history?

I agree that it seems useful to be able to modulate the decay related to older edges. Being able to balance early contributions that help bootstrap projects with ongoing contributions feels important.

Things like this will, I think, be helpful in understanding how the weights impact the distribution, and that might help communities decide what those weights ought to be.


Yep, that’s basically correct.

Yeah, I agree, just haven’t put the slider in yet!

This is another thing on my bucket list! Specifically, I’d like to be able to select a region of the graph and then see the cred table update to show only activity during that time period.
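The region-selection idea could look roughly like this, assuming weekly cred per contributor is stored as a list indexed by week number. The `cred_in_range` helper and the inclusive week bounds are assumptions for illustration, not the prototype's code.

```python
def cred_in_range(series, start_week, end_week):
    """Cred per contributor for weeks start_week..end_week (inclusive),
    sorted from highest to lowest like the cred table.
    series: {name: [cred per week]}"""
    totals = {
        name: sum(weekly[start_week:end_week + 1])
        for name, weekly in series.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


series = {"alice": [4, 0, 2, 1], "bob": [0, 1, 5, 0]}
table = cred_in_range(series, start_week=1, end_week=2)
```

Here the table for weeks 1 to 2 ranks bob above alice, even though alice has more cred over the full history.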