SourceCred

Odyssey "Manual Mode" Brainstorming

So far, SourceCred has depended on GitHub for its data—it considers every issue, pull request, and comment to be a contribution, and assigns scores to these based on how they interrelate. It’s a powerful approach, but it’s missing the human context.

What if we use this hackathon to explore a different approach: one which is focused, first and foremost, on user experience and on soliciting information from contributors? I’m imagining a “SourceCred editor” which allows you to define the cred in a project in terms of “Components” (higher-level pieces of the project, which may be nested), “Contributions” (individual pieces of work for one or more components), and “Creators” (who create the contributions).

For example, let’s suppose we want to compute cred for a dance party that @LB threw at our house a few months ago. We could start with the Components:

  • Venue
    • Preparing the house
    • Decorations
      • Design + theme
      • Buying supplies
      • Putting up decorations
  • Food
    • Buying food supplies
    • Cooking + Preparing
    • Cleaning kitchen afterwards
  • Music
    • Sound system setup
    • DJs / sets
  • Organization
    • Planning
    • Coordination
    • Outreach
      • Make FB post
      • Word of mouth invites

(Note: I’ve represented these like a tree, but it might be a general graph, e.g. “Sound system setup” might be connected to the Venue component as well as the Music component.)

Once components exist, it’s time to add “Contributions”. Each contribution has a description and one or more associated people. For example, I could add three contributions to the “Preparing the house” component:

  • “Writing note to neighbors to let them know about the party” by @LB
  • “Delivering notes to each neighbor” by @LB + @RyanLimbaugh
  • “Re-arrange the living room to be a dance space” by @Dandelion + @Miguel

Components and contributions can also have dependencies on each other. For example, if someone posted an image of the decorations to the Facebook event, that could be included in the “outreach” component, but depend on the “decorations” component.
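To make the data model concrete, here is a rough Python sketch of components, contributions, and creators, using the dance party example. The class and field names (`Component`, `parents`, `depends_on`, `Contribution`) are made up for illustration and are not an existing SourceCred API. Letting a component have several parents is just one simple way to allow a general graph rather than a strict tree.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Component:
    name: str
    # More than one parent turns the tree into a general graph.
    parents: List[str] = field(default_factory=list)
    # Components this one builds on without being nested under them.
    depends_on: List[str] = field(default_factory=list)


@dataclass
class Contribution:
    description: str
    creators: List[str]
    components: List[str]


components: Dict[str, Component] = {
    c.name: c
    for c in [
        Component("Venue"),
        Component("Music"),
        Component("Organization"),
        Component("Preparing the house", parents=["Venue"]),
        Component("Decorations", parents=["Venue"]),
        Component("Sound system setup", parents=["Music", "Venue"]),
        Component("Outreach", parents=["Organization"], depends_on=["Decorations"]),
    ]
}

contributions = [
    Contribution("Writing note to neighbors", ["@LB"], ["Preparing the house"]),
    Contribution("Delivering notes to each neighbor", ["@LB", "@RyanLimbaugh"], ["Preparing the house"]),
    Contribution("Re-arrange the living room", ["@Dandelion", "@Miguel"], ["Preparing the house"]),
]


def ancestors(name: str, components: Dict[str, Component]) -> Set[str]:
    """Collect every component reachable from `name` by following parent links."""
    seen: Set[str] = set()
    stack = list(components[name].parents)
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(components[current].parents)
    return seen
```

With this shape, `ancestors("Sound system setup", components)` contains both "Music" and "Venue", which is the multi-parent case mentioned in the note about graphs.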

Since creating these contributions, writing good descriptions for them, etc, is a form of labor, we should also have a “cred-tracking” component that captures the meta-contribution of keeping the contribution graph up to date.

On top of this, we can add weights on the edges, so that you can say e.g.:

  • Setting up the dance floor gets 2x the weight as delivering the notes
  • The “Venue” component was more important than the “Food” component

We could also imagine the relative weight of each component just being derived from the number of contributions in scope. That might work, but would create perverse incentives to create as many tiny contributions as possible, e.g. “Cleaned the garage” becomes “cleaned the top shelf of the garage, cleaned the middle shelf of the garage, …”. Having some weight mechanism gives a good way to deal with that.
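A small Python sketch of why counting contributions invites splitting, and how weights can neutralize it. Everything here is hypothetical: the names `alice` and `bob`, the function names, and the assumption that a reviewer keeps the total weight of a split-up contribution the same as the original.

```python
def shares_by_count(contributions):
    """Every contribution counts equally. Returns creator -> fraction of cred."""
    totals = {}
    for creator, _weight in contributions:
        totals[creator] = totals.get(creator, 0) + 1
    n = sum(totals.values())
    return {creator: count / n for creator, count in totals.items()}


def shares_by_weight(contributions):
    """Each contribution counts in proportion to its weight."""
    totals = {}
    for creator, weight in contributions:
        totals[creator] = totals.get(creator, 0.0) + weight
    total = sum(totals.values())
    return {creator: weight / total for creator, weight in totals.items()}


# One contribution each, both with weight 1.
honest = [("alice", 1.0), ("bob", 1.0)]

# alice logs "cleaned the garage" as three shelf-sized entries.
# Assumption: the three pieces together keep the original weight of 1.
split = [("alice", 1 / 3), ("alice", 1 / 3), ("alice", 1 / 3), ("bob", 1.0)]
```

By count, alice's share jumps from 0.5 to 0.75 after splitting. By weight, it stays at about 0.5, so there is nothing to gain from slicing the work thinner.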

Then, we should be able to do cred analysis on a component-by-component basis.

  • For any component: which people have cred in that component?
  • For any user: what components do they have cred in?
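Both questions are the same table read in two directions. A minimal Python sketch, assuming each contribution already has a cred score and that cred is split evenly among its creators (both are assumptions for illustration, and the cred numbers are made up):

```python
from collections import defaultdict

# (component, creators, cred) triples. The cred values are invented.
scored = [
    ("Preparing the house", ["@LB"], 2.0),
    ("Preparing the house", ["@LB", "@RyanLimbaugh"], 2.0),
    ("Preparing the house", ["@Dandelion", "@Miguel"], 4.0),
]


def cred_by_component(scored):
    """component -> {person -> cred}: who has cred in each component?"""
    table = defaultdict(lambda: defaultdict(float))
    for component, creators, cred in scored:
        for person in creators:
            table[component][person] += cred / len(creators)
    return {component: dict(people) for component, people in table.items()}


def cred_by_person(scored):
    """person -> {component -> cred}: which components does each person have cred in?"""
    table = defaultdict(lambda: defaultdict(float))
    for component, creators, cred in scored:
        for person in creators:
            table[person][component] += cred / len(creators)
    return {person: dict(parts) for person, parts in table.items()}
```

Either dictionary could feed a per-component or per-person chart directly.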

If we can make some pretty charts for each component, that would be quite interesting!

Naturally, this system should work on software projects too, and interface through the plugin APIs with the rest of SourceCred. For example, the component tree for SourceCred might look like this:

implementation/
  graph/
    api/
    implementation/
    testing/
    documentation/
  plugins/
    git/
      git-loader/
      graph-generator/
      testing/
    github/
      data-loading/
        mirror-module/
      graph-generator/
      reference-parser/
      testing/
  pagerank/
    api/
    performance/
    testing/
    documentation/
  cred-explorer/
    design/
      score-normalization/
    implementation/
      features/
        cred-aggregation/
        node-descriptions/
        weight-configuration/
    testing/
research/
project-management/
  ideation/ (coming up with SourceCred to begin with)
  goal-setting/
community-building/
  discord/
  discourse/
  word-of-mouth-outreach/
  blogging/
  videos/
resourcing/

Then, besides manually adding contributions for the sections (as described above), we could also “link in” existing contributions. We could have a UI that displays every GitHub issue and PR that doesn’t correspond to any component, and lets you tag it with one (or more) components.
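The bookkeeping behind such a UI could be as small as the Python sketch below. The item identifiers (`issue-1`, `pr-2`, `pr-3`) and the function names are hypothetical, and nothing here talks to GitHub; it only shows the "untagged" query and the tagging step.

```python
from typing import Dict, Iterable, List, Set


def untagged(items: Iterable[str], tags: Dict[str, Set[str]]) -> List[str]:
    """Return the issues/PRs that are not yet linked to any component."""
    return [item for item in items if not tags.get(item)]


def tag(tags: Dict[str, Set[str]], item: str, *components: str) -> None:
    """Link an item to one or more components."""
    tags.setdefault(item, set()).update(components)


items = ["issue-1", "pr-2", "pr-3"]  # hypothetical identifiers
tags: Dict[str, Set[str]] = {}
tag(tags, "pr-2", "implementation/plugins/github", "implementation/pagerank")
```

After the call above, `untagged(items, tags)` still lists `issue-1` and `pr-3`, which is the queue a maintainer would work through.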

If this works, this would solve a lot of problems for SC:

  • Makes it possible to recognize non-GitHub contributions
  • Provides overall context so maintainers can signal how each contribution fits into the project
  • Makes the system more spam resistant (the spam/trivial contributions won’t get connected to components)
  • Makes cred scores a lot more interesting (“you have 30 cred in the API component and 90 in the research component; within the research component, 15 came from infrastructure and 65 came from algorithms” is far more interesting than “you have 120 cred”).

If we decide to go down this route, some interesting challenges are:

  • How to make a UI that makes managing and updating this data structure intuitive?
    • Do we use a graph visualization? A text based system? Some other GUI?
    • How do users re-organize the graph, or split one component up, or group a set of related contributions into a new component?
    • How do people collaborate on the UI?

There are also a lot of other neat things we could think about doing, like adding “types of labor” (implementation, communication, logistics, design, research, …) and then tagging each contribution with the applicable labor types. Then in addition to giving a component-wise view of people’s work, you can also see what kind of work they’ve been doing.


From a UI perspective – I’d like this to be accessible to non-techies. Making categories and adding contributions to them should feel straightforward, like using any other modern web app (e.g. Discourse). I think with the awesome UX talent we have on our team this won’t be hard. :)

A goal for measurable success here: does the manual mode wind up being actively used for projects that are not GitHub open-source projects? (E.g. for recognizing the organizers of an event or people who worked on a research project.)

I like the data model and the emphasis on accessible UI. That said, this is difficult for a couple of reasons:

a) As you mention, keeping track of all these contributions is labor intensive. Asking people to do a bunch of data entry after a party, down to logging who cleared the dance floor, seems unlikely to happen. Passively tracking people would require surveillance that people might not be comfortable with (or maybe they would be?), and would force you to create metrics derived from that data, which may not fit neatly into this common-sense data model (which I think is good).

b) Price discovery. In PageRank, cred is determined from the interactions. Here, tasks are valued subjectively in relation to each other (e.g. “Setting up the dance floor gets 2x the weight as delivering the notes”). That is very difficult to do non-arbitrarily. Who gets to decide what cred weights are assigned to certain tasks?

The idea of a “SourceCred editor” makes sense. It taps into the way humans naturally give credit. For instance, for any event big enough to require organizing, there is typically an organizer who steps into that role (or is “volunteered”). This person almost always publicly thanks the various people who helped, in some form: at the end of the event, in an email afterwards, etc. Since that natural labor is already going on, it could perhaps simply be captured and input into the system, either by the organizer or by a “SourceCred editor” who inputs cred manually on an ongoing basis (this could be a task given to a “community manager” type role). Some crypto projects are doing something similar to great effect with community-run tip bots. Though beware, this is the digital meth that fuels the #XRP Army. One would have to be careful with the intentions and goals.

As for price discovery, I imagine one could create “set rates” for different types of contributions. Perhaps just use market prices for those services as a starting point (though you’d risk just porting a chunk of inequality into the new system). One thing I could see working is easy, intuitive ways to express appreciation/valuing, e.g. the claps on Medium pieces, or perhaps emoji reactions on Slack.

Another way to come at this is putting the burden of valuation on the contributors. In my project, each contractor has to create their own invoices every month, with very little guidance (sometimes none) on what is billable and what is not. This requires an ongoing, honest evaluation of one’s contributions (and results in fairly frugal billing). One could have a system where anyone, through a simple UI, could submit a claim for cred, putting a number on it themselves (obviously they would be given some context here so they have a unit of measure). The community (or SourceCred Editor) can then approve or deny (or adjust).
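The claim flow described here (submit a number, then approve, deny, or adjust) is simple enough to sketch in Python. The `Claim` class, the status strings, and the `review` function are all hypothetical names for illustration, and who is allowed to call `review` is left open.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Claim:
    claimant: str
    description: str
    requested: float        # the number the contributor puts on their own work
    status: str = "pending"
    granted: float = 0.0


def review(claim: Claim, decision: str, adjusted: Optional[float] = None) -> Claim:
    """Apply a reviewer decision: "approve", "deny", or "adjust"."""
    if claim.status != "pending":
        raise ValueError("claim has already been reviewed")
    if decision == "approve":
        claim.status, claim.granted = "approved", claim.requested
    elif decision == "deny":
        claim.status, claim.granted = "denied", 0.0
    elif decision == "adjust":
        if adjusted is None or adjusted < 0:
            raise ValueError("an adjustment needs a non-negative amount")
        claim.status, claim.granted = "adjusted", adjusted
    else:
        raise ValueError(f"unknown decision: {decision}")
    return claim
```

For example, `review(Claim("alice", "Organized a meetup", 10.0), "adjust", adjusted=6.0)` records the claim as adjusted with 6.0 cred granted (`alice` and the amounts are made up).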

I agree, it would be quite labor intensive. Realistically, I don’t think this system will actually be used by people doing meatspace activities like throwing a dance party (although it’s a cool proof of concept to show how it would apply). Domains like open-source are a great fit because all of the activity gets logged passively (e.g. git log) so rather than having humans do the logging, humans just need to organize the contributions into higher-level categories.

E.g. if SC has defined 20 components, then for each PR I can select the right component. This would be way easier than having each person need to log each contribution since you can batch process them (and categorization is easier than generation).

I imagine the system still using PageRank to assign values to the scores, based on dependencies/interconnections. The weights here are indeed (inter)subjective and setting them appropriately will be hard. In the case of the dance party where there isn’t a rich dependency structure between nodes, most of the score comes from the weights so it’s very important. For open-source projects, hopefully the weights should be a bit less pivotal – contributions’ scores are more driven by how they were depended on rather than on the weight on the contribution.
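To illustrate how edge weights and graph structure interact, here is a plain textbook weighted PageRank by power iteration in Python. This is a sketch, not SourceCred's actual implementation. The graph shape (cred flowing from a component to its contributions to their creators), the damping factor of 0.85, and the iteration count are assumptions. The 2:1 weighting comes from the dance floor example earlier in the thread.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """edges: {source: {target: weight}}. Returns node -> score; scores sum to 1."""
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node gets the same teleportation share each round.
        nxt = {node: (1.0 - damping) / n for node in nodes}
        for source in nodes:
            targets = edges.get(source, {})
            total = sum(targets.values())
            if total == 0:
                # Dangling node: spread its score evenly over all nodes.
                for node in nodes:
                    nxt[node] += damping * score[source] / n
            else:
                # Split the source's score in proportion to edge weights.
                for target, weight in targets.items():
                    nxt[target] += damping * score[source] * weight / total
        score = nxt
    return score


edges = {
    "Venue": {"dance floor": 2.0, "notes": 1.0},
    "dance floor": {"@Dandelion": 1.0, "@Miguel": 1.0},
    "notes": {"@LB": 1.0, "@RyanLimbaugh": 1.0},
}
scores = pagerank(edges)
```

In this tiny graph there is almost no dependency structure, so the 2:1 weight is what puts @Dandelion and @Miguel ahead of @LB and @RyanLimbaugh. In a graph with many dependency edges, the same weights would matter less.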

I’m not very familiar with the XRP community. I would like to avoid creating digital-meth-fueled internet armies; legions of financially motivated trolls are not the legacy I’d like to have. Can you elaborate a bit more on the XRP army and what the lessons learned are? (cc/ @anthrocypher)

Yeah, I’d rather that people get rewarded based on the value their contribution created, rather than the replacement cost of what someone else would have been willing to do it for. This might lead to some weird supply/demand dynamics (what if the value of fixing a bug in SC is $10,000, but it only takes 2 hours of work to fix it – will there be a huge rush of people trying to fix it, with people trying to bribe the maintainer to accept their patch?). But rewarding people based on value rather than replaceability is the right principle to start from, I think.

Letting people submit their own claims will be a part of the puzzle, for sure. However, I’d rather this not be the main mode by which people get rewarded, because not everyone is equally comfortable with self-advocacy, and it will tend to favor people who are aggressive in seeking rewards.

To mitigate this, I’d like to have a culture of appreciation where people are proactive in adding entries to the graph recognizing the work that others have done. There could even be a small incentive for this (if you add a node that is validated and flows 1 cred to someone else, you get 0.05 cred yourself for the work of documenting the work). Of course, this also creates the risk of cliques/upvote circle type dynamics. In the short term, needing all new contributions to go through some kind of review process should mitigate the gaming.
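The documentation incentive is simple arithmetic. As a Python sketch (the 0.05 rate is the figure from the paragraph above; the function name, and the choice to pay nothing for unvalidated nodes or for documenting your own work, are my reading of “validated” and “to someone else”):

```python
DOCUMENTER_RATE = 0.05  # cred earned per 1 cred that flows to someone else


def documenter_reward(cred_flowed: float, validated: bool,
                      documenter: str, recipient: str) -> float:
    """Cred earned for adding a node that recognizes another person's work."""
    if not validated or documenter == recipient:
        # Unreviewed entries and self-recognition earn nothing.
        return 0.0
    return DOCUMENTER_RATE * cred_flowed
```

Tying the reward to validation is what connects it to the review process: an upvote circle only pays off if the reviewers let its entries through.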

Domains like open-source are a great fit because all of the activity gets logged passively (e.g. git log) so rather than having humans do the logging, humans just need to organize the contributions into higher-level categories.

I’m torn here. I think just limiting the scope to open-source development activity (anything in GitHub) is smart. I’m wary of scope creep and of stretching already thin bandwidth. At the same time, this presents an opportunity to capture non-coding activities, which will improve SC’s overall social mission and possibly make it more robust/valuable in the future, not to mention avoid possible criticism that it’s just increasing the inequality between code and non-code contributions (which it might very well).

E.g. if SC has defined 20 components, then for each PR I can select the right component. This would be way easier than having each person need to log each contribution since you can batch process them (and categorization is easier than generation).

Having a limited (but expressive) number of categories with “standard” cred “rates” is an efficient, doable starting point. This is the dominant, proven model in our current labor markets. E.g. organizing a meetup gives you a standard 20 cred (or X% of total cred distributed this month, etc.). Then get it in GitHub in a way that can be plugged into the PageRank somehow.
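The two pricing options mentioned here (a flat rate per category, or a percentage of the month's total cred) could look like this in Python. Only the 20-cred meetup rate comes from the post; the category key, the function names, and any percentage you plug in are placeholders.

```python
# The only rate named in the thread; other categories would be added by the project.
STANDARD_RATES = {"organize-meetup": 20.0}


def fixed_cred(category: str, rates=STANDARD_RATES) -> float:
    """Flat rate: every contribution in a category earns the same cred."""
    return rates[category]


def share_of_monthly_cred(percent: float, total_cred_this_month: float) -> float:
    """Alternative: a category earns a fixed percentage of the month's total cred."""
    return total_cred_this_month * percent / 100.0
```

The flat rate is easier to explain. The percentage version keeps a category's relative importance constant as total cred grows.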

For open-source projects, hopefully the weights should be a bit less pivotal – contributions’ scores are more driven by how they were depended on rather than on the weight on the contribution.

This is the larger problem in incorporating “meatspace” contributions IMO. Organizing a meetup could cause a connection to be made which benefits the project more than any single code commit. Perhaps this comes in via higher-level elements I’ve skimmed but not looked into. For example, a person is a node, whose contributions are then tracked. If a meetup organizer’s node could claim credit linking them to that other node, they could get cred flowing back to them over time. This is similar to how, when recruiters place a contract employee, they get something like 15% of that person’s salary for a year (not an ideal thing to replicate, but this is a real-world example, so something to learn from).

I’m not very familiar with the XRP community. I would like to avoid creating digital-meth-fueled internet armies; legions of financially motivated trolls are not the legacy I’d like to have. Can you elaborate a bit more on the XRP army and what the lessons learned are? (cc/ @anthrocypher)

So, I just spent like 10 minutes googling around to see if Ripple Labs (or an associated entity) is directly funding the tip bots (or paying people to tip). Couldn’t find anything but pro-XRP pieces on the first two pages of Google. That is terrifying. Ripple is essentially the most effective marketing machine/MLM 2.0 I’ve ever seen. It appears they are also funding further development of tipping bots. The guy that created the initial XRP tip bot just got funded by Ripple to work on it full-time (his dream come true). A couple of days ago they announced the tip bot now has a “recurring tip” feature using micropayments. That’s fucking brilliant. It could create essentially a decentralized Netflix of shills, possibly sustaining top “creators” enough to do it part-time or even full-time. If Ripple is funneling enough XRP to XRP army “lieutenants”, who scour Twitter/Reddit/etc. rewarding pro-XRP content to encourage/recruit new soldiers (which I believe they’re doing), they essentially create something akin to what SourceCred is doing, just rewarding marketing content rather than contributions. Add recurring micropayments between those “community” members, and the thing gets stickier…

Another example I know more about is DASH’s tipping bot. This we have more insight into, because funding for it was done publicly via on-chain voting. From an article on DASH’s governance, quoting the original proposal to create “DASH Force” (again the military metaphors):

“Dash Force members will tip community members, core team members and slack users/mods who are quick to answer questions in the Slack channels, The DASH forum, BTCtalk and Reddit etc. People will also be tipped for posting links to positive/negative threads and articles so we can quickly organize swarms to go upvote/downvote and comment. Dash force members will organize all this and take the lead to get everything started.”

This was effective enough that DASH Force was funded again in several subsequent proposals, and I believe it is still in operation throughout the bear market (gotta prop up that currency that’s not actually being used somehow).

I am cautious about using these ideas, as Ripple and DASH are gross marketing operations that could collapse and burn a bunch of people, due to lack of actual use of the currency. However, they do present a fascinating, large-scale, proven system for crowdsourced valuation of work. In this case, it’s rewarding shills. But the same architecture could be used to empower community members to go around valuing more substantial work, creating community. Imagine the “SourceCred Editor” has a tip bot with a discretionary budget, funded by the SC treasury, and can, in the normal course of community building efforts, “high-five”/pay members in more robust (and perhaps offline) ways. E.g. a community member gives a “shoutout” Tweet @‘ing someone (or a list of people) for helping organize a dance party. Cred flows to those people. Perhaps at the same time the bot creates a GitHub Issue or PR to log the cred in the system (automating the creation of GitHub elements is fairly easy with their API). This person (or persons) is also empowered to manually create higher-level links (publicly and with other people able to comment/veto) in GitHub.

Yeah, I’d rather that people get rewarded based on the value their contribution created, rather than the replacement cost of what someone else would have been willing to do it for. This might lead to some weird supply/demand dynamics…

I think the SC algo already does a good job of valuing/rewarding bug fixes, code, etc. I’m more referring to meatspace activities currently not captured.

Letting people submit their own claims will be a part of the puzzle, for sure. However, I’d rather this not be the main mode by which people get rewarded, because not everyone is equally comfortable with self-advocacy, and it will tend to favor people who are aggressive in seeking rewards.

I agree it shouldn’t be the only (or primary) input. However, I think some mechanism allowing people to claim credit is good, in part just because, in these complex systems, sometimes the only person who can accurately value a contribution is the person doing it. There simply isn’t enough visibility/bandwidth for other humans (even backed by algos) to catch everything. Again, I think this is more for non-code contributions. The genius of SC is that it credits contributions that show up in GitHub objectively in a very effective/fair way (and can be made more fair via customization for the project).

To mitigate this, I’d like to have a culture of appreciation where people are proactive in adding entries to the graph recognizing the work that others have done.

Yes. See tip bot argument above.

Of course, this also creates the risk of cliques/upvote circle type dynamics. In the short term, needing all new contributions to go through some kind of review process should mitigate the gaming.

An important part of this will be some layer of review, presumably by humans. This introduces messy politics and gaming, and possibly a governance layer if the number of humans involved gets beyond 50 or so. But I think that’s going to be needed anyway just to manage parameters/development of a SC instance. But again, SC is good enough as is to start directing low-level funds IMO.