SourceCred Codebase Walkthrough

Description


The SourceCred Codebase Walkthrough is the latest and greatest overview of the technical architecture of SourceCred. It will help you understand the context and thinking that went into design decisions around the architecture of the SourceCred codebase.


Note: this Artifact is expected to be updated as the codebase changes. All contributions will be listed in the Contributions section; the latest version of the SourceCred Codebase Walkthrough will always be listed above, in the Description section of this Artifact.

Contributions



References




@burrrata, what do you think is the best format for the codebase walkthrough? Should I record a new video perhaps?


An up to date video would be awesome!

To start, there’s the “What is SourceCred?” question. To answer this there could be a 5-minute overview/intro, but the focus here should be on the code. We can then have separate videos and presentations talking about SourceCred from a high level.

Then the first thing we want to figure out is how to install SourceCred, how to deploy it, and how to hack on it. This will involve walking through verifying the dependencies and installing a fresh build of SourceCred. Hopefully this is quick, and for further details people should refer to the README. Ideally this would allow someone to follow along with the video on their own computer if they wanted.

Then we need to actually dive into the code. At the moment I see a few main parts:

  • Foundational mechanisms that should “just work” such as the Markov chain and PageRank algorithms.
  • SourceCred data structure components such as how the nodes and edges are created, referenced, and updated.
  • Plugins to connect data into SourceCred.
  • UI to see and interact with the underlying data.

These could be explored for 10 minutes each. Even though people won’t be using the foundational mechanisms it’s important (or at least interesting) to know what they are and how the other components interact with them. Then most of the time can be spent showing devs how data is structured, how state is created, and how to create plugins to modify/create that state. Then we can explore how to visualize and interact with that data, and how one might improve that aspect of the codebase too.
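To make the "foundational mechanisms" and "data structures" parts concrete, here is a minimal sketch of PageRank run by power iteration over a tiny graph of nodes and weighted edges. It is an illustration only, not SourceCred's actual implementation: the function name `pagerank`, the `alpha` teleport parameter, the node names, the edge direction (from contribution to author), and the weights are all assumptions made up for this example.

```python
from collections import defaultdict


def pagerank(nodes, edges, alpha=0.15, iterations=100):
    """Score nodes by power iteration on a weighted directed graph.

    nodes: list of node ids (hypothetical names, not SourceCred addresses)
    edges: list of (src, dst, weight) tuples with positive weights
    alpha: probability of teleporting to a uniformly random node
    """
    # Total outgoing weight per node, used to normalize transitions.
    out_weight = defaultdict(float)
    for src, _dst, weight in edges:
        out_weight[src] += weight

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives an equal share of the teleport mass.
        nxt = {node: alpha / n for node in nodes}
        # Each edge forwards score in proportion to its weight.
        for src, dst, weight in edges:
            nxt[dst] += (1 - alpha) * score[src] * weight / out_weight[src]
        # Nodes with no outgoing edges spread their score uniformly.
        dangling = sum(score[node] for node in nodes if out_weight[node] == 0)
        for node in nodes:
            nxt[node] += (1 - alpha) * dangling / n
        score = nxt
    return score


# Assumed toy graph: an issue references a pull request, and both credit a user.
nodes = ["alice", "pr-1", "issue-7"]
edges = [
    ("pr-1", "alice", 1.0),
    ("issue-7", "pr-1", 1.0),
    ("issue-7", "alice", 0.5),
]
scores = pagerank(nodes, edges)
```

In this toy graph the scores sum to 1 and the user ends up with the highest score, because score flows along the edges toward her. A plugin, in this picture, would be whatever produces the `nodes` and `edges` lists from an external data source.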

Overall it would be great if the video was 45-60 minutes:

  • 5 min SourceCred intro
  • 5 min dev env setup
  • 10 min foundational mechanisms
  • 10 min data structures
  • 10 min plugins overview
  • 10 min UI overview
  • 5 min wrap up

In an ideal world we could do one of these every month so that there’s always an up to date fresh version that devs can watch to understand the codebase as well as how to contribute to the things that need work (plugins, UI, etc). Even just doing this every once in a while, however, would be great. That way someone can easily understand the system, how the pieces connect, and how they might contribute.

These are just a few ideas, but it would be great to get input from more developers too! :)

Also, @decentralion, do you happen to have a template/overview of your dev environment as well?


I love this idea. @burrrata would you be willing to champion it? If you do, I will be at your disposal, in terms of doing the walkthrough itself. However I don’t have the bandwidth to organize it, figure out how to best distribute the result, etc.

I’m down with the idea of doing this on a monthly basis so we keep this artifact up to date.

I’d be happy to help however I can. I think I can contribute the most value by helping to organize a more detailed outline of what the video might look like. Then, after we shoot the video, I could turn the walkthrough into a technical overview document as well as READMEs for each main section of the codebase. This way we would have an up-to-date video walkthrough as well as really basic docs for the main components of the codebase. This wouldn’t be super detailed, but at least it would help people get started exploring and hacking on the codebase.

It should be noted, however, that at the moment I do not have any video editing or production expertise. I could help design, document, and share the walk through, but you (or someone) would have to actually make the video.

On my end this might involve 8-13 hrs:

  • 3-5 hours reading through the codebase to try to make sense of everything and then organize that into an outline of the core components.
  • 3-5 hours compiling the actual content from the outline and video into a technical overview document as well as READMEs for the main sections of the codebase
  • 2-3 hours repackaging the content into a format that can be shared across YouTube, Twitter, Discourse, etc…

On your end this might involve 2.5 - 5 hrs:

  • 30-60 min answering questions I inevitably will have about the codebase
  • 30-60 min reviewing and enhancing the video outline
  • 60 min filming the SourceCred Codebase Walkthrough video
  • 30-60 min editing the SourceCred Codebase Walkthrough video (can skip if no editing is required)
  • 30-60 min reviewing the technical overview and README docs to make sure they are intuitive and accurate

Does that kind of sound like what you had in mind?

Personally I’d much prefer a series of videos rather than a monolithic one that cannot be readily digested or easily updated.

  • Why SourceCred - a separate presentation, targeting a wider audience.
  • Why and how a dev would want to join the codebase part of the SourceCred project is a topic that interests me. Maybe cover some history, vision, and culture, some low-impact ways to get involved in testing or porting, and some TODOs or wishlists.
  • For instructions on how to install / set up / build / launch, I personally much prefer a text description with command sequences that can be copied. I find install videos a waste of my time, particularly since the steps or the appearance could be different on my system, and they invariably become obsolete.
  • The specific application of PageRank, in conjunction with the constructed graph and weights, is a worthwhile topic to cover in detail for those interested in tuning, optimizing, or adding protections against gaming the system.
  • Those looking to extend the capabilities to other data sources, like Discord or Telegram or whatever, will need an overview of the plugin architecture and some hints or recipes. Whether this type of information is conducive to a video presentation is unclear to me, as covering such topics typically starts with a diagram and in the end it comes down to API usage examples.
  • One area I’m interested in is constructing a graph out of a collection of repos corresponding to a project or an effort, with combined accounting, which is orthogonal to a new data plugin. BTW, how should a cloned repo, with extensive new work, be accounted for, when combined with the upstream one(s)?
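
As a rough illustration of the multi-repo idea in the last bullet, here is a minimal sketch of combining per-repo graphs into one graph before scoring. It assumes nodes and edges are keyed by a globally unique address, so a contributor who appears in several repos collapses into a single node. The `merge_graphs` helper, the dictionary layout, and the address tuples are all hypothetical and are not taken from the SourceCred codebase.

```python
def merge_graphs(*graphs):
    """Union several graphs whose nodes and edges are keyed by address.

    Each graph is a dict with "nodes" and "edges" mappings (an assumed
    layout). Entries that share an address are kept once, first one wins.
    """
    nodes = {}
    edges = {}
    for graph in graphs:
        for address, node in graph["nodes"].items():
            nodes.setdefault(address, node)
        for address, edge in graph["edges"].items():
            edges.setdefault(address, edge)
    return {"nodes": nodes, "edges": edges}


# Hypothetical addresses: the user is global, pull requests are per repo.
ALICE = ("github", "user", "alice")
PULL_A = ("github", "pull", "org/a", "1")
PULL_B = ("github", "pull", "org/b", "4")

repo_a = {
    "nodes": {ALICE: "user", PULL_A: "pull"},
    "edges": {("authors", "org/a", "1"): (PULL_A, ALICE)},
}
repo_b = {
    "nodes": {ALICE: "user", PULL_B: "pull"},
    "edges": {("authors", "org/b", "4"): (PULL_B, ALICE)},
}

combined = merge_graphs(repo_a, repo_b)
```

Here `combined` has three nodes and two edges: both pull requests point at the same user node, so a single scoring pass over the merged graph accounts for her work in both repos. This sketch says nothing about how a cloned repo with extensive new work should be weighed against its upstream; that question stays open.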