SourceCred as an incentive compiler

Copying these from the “The power of knowledge graphs” thread. It would be awesome to have a blog post on SourceCred’s amazing ability to align incentives in a positive-sum way :slight_smile:


Always thought the incentive compiler meme was strong. Will appeal esp to anyone that’s done development.

I would say that SourceCred 1.0 (the scores within a repo/Discourse) is like a working low-level language compiler (i.e. assembly language -> 101000101). We’re now working on high-level compilers (i.e. C++ -> 101000101). High-level compilers are much harder, but more widely applicable.

Speaking as someone who loves writing compilers, I’m intrigued by this idea, and would like to probe it. Please, let me invite you to my mental model.

A language is an assignment of semantics to syntax. A compiler is a transformation from syntax in one language to syntax in another (maybe the same) language that preserves the semantics.

With this framing, a bunch of common tools are seen to be special cases of compilers:

  • An optimizer takes one program and spits out a new one with the same behavior (semantics), but that generally runs faster than the original program.
  • An obfuscator is a compiler whose output is harder to understand than its input.
  • A minifier is a compiler whose output is smaller than its input.
  • A prettifier is a compiler whose output is more consistently formatted than its input, and maybe more aesthetically pleasing.
  • An assembler is simply a compiler whose output language is called “assembly”, and likewise for a disassembler.
  • A decompiler is just a compiler.
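
To make two of these bullets concrete, here is a minimal sketch that treats JSON as the language, with "semantics" taken to mean the parsed value. The function names (`meaning`, `minify`, `prettify`) and the sample document are made up for illustration; only the standard `json` module is assumed.

```python
import json


def meaning(document: str):
    """Semantics of a JSON document: the value it denotes once parsed."""
    return json.loads(document)


def minify(document: str) -> str:
    """A compiler from JSON to JSON whose output is no larger than needed."""
    return json.dumps(json.loads(document), separators=(",", ":"))


def prettify(document: str) -> str:
    """A compiler from JSON to JSON whose output is consistently formatted."""
    return json.dumps(json.loads(document), indent=2, sort_keys=True)


# Hypothetical input, chosen only to have irregular whitespace.
source = '{ "repo": "sourcecred",  "weights": [1, 2,   4] }'
small = minify(source)
pretty = prettify(source)

# Both transformations change the syntax but leave the meaning alone.
assert meaning(small) == meaning(source) == meaning(pretty)
assert len(small) < len(source)
```

The point of the sketch is the pair of assertions at the end: correctness for these tools is stated entirely in terms of the preserved meaning, not in terms of what the output text looks like.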

But note that I haven’t said anything about programming languages. As formulated, this applies to natural languages, too, and it applies to languages that aren’t programming languages or natural languages. The GraphQL Mirror module has a method extract that transforms a subset of a SQL database into a JavaScript object graph, preserving the associated semantics at each end. The TensorBoard data_compat module exposes a function that transforms “v1-style summaries” to “v2-style summaries” while preserving their semantics. These are compilers, and this generalization is critical.

A while ago, on the topic of whether various interactions constitute transactions, @decentralion said:

This is an astute point, and I’d like to refine it a bit. Let’s move up a level. A definition is useful if you can reason about generic instances of the term and apply that reasoning to specific instances. An instance of the definition is useful if the high-level reasoning transfers faithfully and provides insight.

For instance, the above definition of compiler is useful because we can make the general statement that “compilers compose”: if you have a correct compiler from A to B and a correct compiler from B to C, then you also have a correct compiler from A to C, and you know exactly what it does. The classifications of, say, obfuscators and optimizers as compilers are both useful because we now know what happens if you take a program, run it through an optimizer, and then run that through an obfuscator: you have a semantically equivalent program that likely runs faster and is harder to read, because compilers compose.
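
The “compilers compose” claim can be sketched on a toy language of arithmetic expressions over one variable `x`. Everything here is an illustrative assumption rather than anything from a real toolchain: the helper names (`semantics`, `optimize`, `obfuscate`, `compose`), the choice of constant folding as the optimization, and the masking offset. It relies on `ast.unparse`, so it assumes Python 3.9 or newer.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def semantics(program: str, x: int) -> int:
    """Meaning of a program: the value it computes for a given input x."""
    return eval(program, {"__builtins__": {}}, {"x": x})


class _Fold(ast.NodeTransformer):
    """Optimizer pass: replace constant subexpressions with their value."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        op = _OPS.get(type(node.op))
        if (op and isinstance(node.left, ast.Constant)
                and isinstance(node.right, ast.Constant)):
            return ast.Constant(op(node.left.value, node.right.value))
        return node


class _Mask(ast.NodeTransformer):
    """Obfuscator pass: rewrite each integer c as (c + 41) - 41."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.BinOp(ast.Constant(node.value + 41), ast.Sub(),
                             ast.Constant(41))
        return node


def optimize(program: str) -> str:
    return ast.unparse(_Fold().visit(ast.parse(program, mode="eval")))


def obfuscate(program: str) -> str:
    return ast.unparse(_Mask().visit(ast.parse(program, mode="eval")))


def compose(first, second):
    """If first: A -> B and second: B -> C are compilers, so is A -> C."""
    return lambda program: second(first(program))


source = "x * (2 + 3) + 4 * 5"
pipeline = compose(optimize, obfuscate)

# The composed compiler preserves semantics on every input we try.
for x in range(-3, 4):
    assert semantics(pipeline(source), x) == semantics(source, x)
```

Neither pass knows about the other; the guarantee for the pipeline follows only from each pass preserving `semantics` on its own, which is the property the general definition buys.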

Personally, I derive great value from this broad definition of compiler. When I want to write a compiler, I know how to structure its internals, how to test its correctness, what it even means for it to be correct, how it should interface with outside systems. In fact, I find this conceptual framing so useful that whenever I’m working, I’m generally running a low-priority background thread called “find the compiler”, in which I try to identify how the problem that I’m trying to solve can be expressed in this framework, after which it becomes basically a “known problem” whose solution I just have to implement.

So: Is SourceCred an incentive compiler? That depends. What language is it compiling from, and what’s it compiling to? What semantics is it preserving, and what non-semantic transformations is it making, if any? And how does this help us reason about what it is or should be?

Some references for further reading:

(Exercise for the interested reader: In the context of programming, what are “compiled languages” and “interpreted languages”?)


I love taking metaphors until they break :smiling_imp:

I was definitely thinking of it in the most common definition of the word, which is as an assembler. I.e. taking a high-level desired behavior and ‘compiling’ that into more granular, lower-level incentives that aim to generate, in aggregate, the desired low-level behavior. For the core algo, which actually generates scores that are usually meaningful enough to be actionable (though typically only within a single repo/Discourse instance, as activity is different in nature across repos), it’s a bit fuzzy. Because the semantics are fuzzy. We’re looking at what’s been valued in the past and saying, “More of that please…”. It remains to be seen if we’ll actually get more of that in the wild, but let’s say for the sake of argument that it works well enough. This seems more like AI almost, which learns a model by being fed “successful” data. It “compiles” code much the same way it can generate target images using other images.

I suppose one can also say that new Issues, PRs, etc. are a more specific set of semantics guiding behavior. In this case the language is English (or another spoken language). But also, in my experience in OSS, the language of doing. It is concrete actions, guided by Issues, etc., or self-created but still valuable contributions, that generate :heart:s, PR reviews, and other actions that confer meaningful cred, driving the incentives.

When Initiatives come online, they will act more like high-level programming languages. Giving more specific semantics (desired behaviors).

These seem kinda the same, as both compile high-level languages to low-level languages. One just does it ahead of time in a more optimal manner, and the other does it on the fly. Maybe Initiatives are more like compiled languages, and the base SC algorithm is like a run-time compiler, the other “developers” entering “commands” in the form of Issues, comments, likes, etc.



I think that there is an interpretation of SourceCred as an incentive compiler that’s compatible with @wchargin’s framing of a compiler as a translator from one language’s syntax to another, but it’s really fuzzy.

In this case,

  • the preserved semantics are all statements about who deserves credit
  • the input syntax/language is in terms of actions and weights
  • the output syntax/language is some quantified values
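
As a rough sketch of that reading, the toy function below takes “actions and weights” as input and emits “quantified values” as output. To be clear, this is not SourceCred’s actual algorithm; it is a plain weighted sum, and the name `compile_cred`, the action kinds, the weights, and the contributors are all hypothetical.

```python
from collections import defaultdict


def compile_cred(actions, weights):
    """Toy 'compiler': (contributor, action kind) pairs -> normalized scores.

    Assumption for illustration only: each action contributes its weight to
    the contributor who performed it, and scores are normalized to sum to 1.
    """
    totals = defaultdict(float)
    for contributor, kind in actions:
        totals[contributor] += weights.get(kind, 0.0)
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: value / grand_total for name, value in totals.items()}


# Hypothetical input "program": who did what, and how much each kind counts.
actions = [
    ("alice", "pull_request"),
    ("bob", "review"),
    ("alice", "review"),
    ("carol", "like"),
]
weights = {"pull_request": 4.0, "review": 2.0, "like": 1.0}

scores = compile_cred(actions, weights)

# The statement "alice deserves more credit than bob, who deserves more than
# carol" is expressible on both sides of the transformation.
assert scores["alice"] > scores["bob"] > scores["carol"]
```

Even in this toy version the output carries less information than the input (you cannot recover the individual actions from the scores), which is one way to see why the compiler reading is fuzzy.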

There are definitely some generative effects that arise, and the conversion definitely changes the information content. I think that the fact that we would consider them features, not bugs, makes it clear that it’s not intended to be a compiler in the category-theoretic sense. It almost looks more like an interpreter than a compiler at this level, but it’s an interesting perspective to consider.

I like the imagery of ‘incentive compiler’ but I concur on its mismatch with the technical concepts as outlined by @wchargin. In my opinion SourceCred is a textbook example of the kind of multiscale feedback process described below. I think finding a useful shorthand that brings this dynamic to mind intuitively, without requiring this level of depth, is important, and I think the concept “incentive compiler” is actually a step in the right direction. With that in mind, here is a distillation of my research on the subject, in hopes that this community might be stimulated to arrive at a highly compressed analogy that holds up when held to the fire.

*Economic systems are often observed to have properties that are not directly attributable to the agents, processes and policies that make up the economic system. Understanding the emergent properties as arising from relationships between the agents, processes and policies requires the multiscale perspective. Through a synthesis of these perspectives on multi-scale systems, a basic formula for framing practical economic models is shown in Figure 3. Any model requires assumptions about the properties of its constituent parts and assumptions about the environment or larger system in which the model is embedded. Couched in economic terms the model of the larger system provides macro-economic context and the models of the constituent parts provide micro-economic foundations.

Applying a multiscale perspective to economic systems is not a new idea. It has been addressed implicitly by representatives of the Austrian School of Economics, and also other heterodox economic schools including Complexity Economics [Foster 2005], [Montuori 2005], [Bateson et al. 1989] and Ecological Economics [Common and Stagl 2005], [Schumacher 2011]. While Ecological Economics was originally motivated by ecology rather than systems theory, it also criticized the failings of the orthodox economic canon in addressing the complex dynamics that arise when there are interaction effects between parts and wholes, with special attention to human activity as being a part of the natural world. A recent yet widely accepted idea in macroeconomics, the Lucas Critique [Lucas 1976], explicitly addresses feedback effects between micro and macro scale behavior. The need for multiscale representations is further borne out in Evolutionary Economics [Dopfer, Foster and Potts 2004] and in the standard practice of systems engineering [Hamelin, Walden and Krueger 2010].

Through the multiscale perspective, it is possible to study interscale phenomena such as emergence as shown in Figure 4. “Emergence (…) refers to the arising of novel and coherent structures, patterns and properties during the process of self organization in complex systems. Emergent phenomena are conceptualized as occurring on the macro level in contrast to the micro level components and processes out of which they arise.” [Goldstein 1999].

Emergence closes the feedback loop of the macro, meso and micro level activities where policy makers measure phenomena on a macro level, decide over new policies on a meso level, and implement these policies impacting agent behavior at a micro level, which in turn result in systemic effects that can only be measured on a macro level.* ~ Voshmgir, Zargham [Accepted for MIT CryptoEconomic Systems Conference]

I realize that this is long-winded, but it’s been an ongoing challenge to distill and frame the complex interaction between the

  • observed macro properties that we care about as system designers and stewards
  • imposed meso policies that we define to facilitate, constrain and incentivize activity of participants
  • individual micro or individual agent perceptions, trade-offs, decisions and outcomes

The image @s_ben shared of transfer learning really resonated with me. What it called to mind was a process that combined a desired outcome (the artistic style, or in my example above the observed macro properties) with something inherently individual (@wchargin) to create something new that was in essence both. What was swept under the rug is some highly complex process that achieved the fusion. In a sense we are defining that highly complex process, which is capable of fusing not just one individual with the desired style but many; furthermore, the interactions of those individuals with the process and with the outcome are non-trivial. To carry the art example further, it’s as if we have an unknown (and changing) mass of individual inputs and we wish the output to be a mosaic that adheres to some desired properties at the level of both the individual pictures and the composite artwork, even as the inputs change over time.

If this line of inquiry is interesting to you, I strongly suggest reading more of the work of Jason Potts.
