Refactoring the load system

Beanow · February 5, 2020, 9:59am

Description

The existing load system was quite complex, which made it hard to change. And it needed to change, in order to support Unified reference detection

This Initiative is about biting the bullet to break the existing load system into smaller parts that aren’t as difficult to change.

Status: Completed

The refactor itself has been implemented! Switch to LoadContext based loading

Champion

@Beanow

Benefits

Divides and conquers the complexity of the load system.
More robust unit tests.
Easier to implement changes in the future.
(better tests = more confidence, less complexity = less headaches for devs)
Many implicit features of the load system become explicit.
For example, we now have a definition of the “SourceCred data directory” we didn’t have before.

Implementation Plan

Apply head to keyboard

Deliverables

A design document which outlines the new concepts we’ll use to break the load system into smaller parts. And plenty of PRs to implement the refactor.

Dependencies

None at the moment.

References

Discussion / trackers

Related refactors, used for reference

Contributions

Design doc

Load refactor design doc #1586

LoadPlan version (concept rejected)

LoadContext version (concept accepted)

Prerequisite refactoring

Cleanup

Beanow · February 5, 2020, 10:26am

@decentralion can you give this a look to see if it makes sense? I’ve included some of your related work. Though not all, as WeightedGraph could be it’s own Initiative / Artifact.

Some I’m directly using, so they’re contributions:

Some are replaced but provided input. I’ve opted for references here, because they were intended to support WeightedGraph, not directly to address the complexity of load:

Beanow · February 5, 2020, 2:12pm

Fun fact, if you’re wondering why so many contributions. Some of the first work started way back in mid-December. It’s been a loooong time coming

decentralion · February 5, 2020, 5:39pm

Looks good to me.

s_ben · February 5, 2020, 5:56pm

Wow. So much work!

Selfish question: will this allow me to run SourceCred on /Bitcoin and other old, messy GitHub repos with lots of dirty data that SourceCred currently chokes on?

decentralion · February 5, 2020, 7:04pm

No, although @wchargin is working on a change that likely will. See: GraphQL Mirror fidelity awareness

s_ben · February 5, 2020, 7:18pm

Oooooh. 'd Initiative for what it’s worth.

Beanow · February 28, 2020, 12:52pm

Added: Improve test readability of loadContext and pluginLoaders

Beanow · February 28, 2020, 3:24pm

Believe this is now completed!

decentralion · February 29, 2020, 2:56am

Yeah, I think this is completed. Although based on my experience hacking on CredRank during ETHDenver, I’m worried that the new load system goes overboard on DI in a way that is frustrating/has lots of boilerplate for making changes. So I think we may still have more iterations to arrive at a system that feels like a good balance of testability but also legibility / ease of working with. (cc @wchargin)

Beanow · February 29, 2020, 1:48pm

Perhaps a good moment to share some retrospect thoughts :]

Agree there’s frustration/boilerplate left. However, this initiative/refactor started because the previous version was impossible spaghetti, where all the tests would break if you change how the pipeline itself worked. (Such as breaking up the mirror+graph step into two separate steps).

Introducing such pipeline changes have gone from nightmare to doable now, for example Expose a ReferenceDetector to createPluginGraphs made a change in the pipeline.

Something that I feel became more effortful is having to declare a Loader interface and object for every plugin. While more work, I think it’s an improvement over the before-situation of directly depending on plugin-internal functionality in the load pipeline.

Another trade-off that creates more effort, is how LoadContext is required to be a “dumb” composition class, not allowed to implement any logic. It makes for way more robust and easier testing, but is also an extra step and not the most intuitive abstraction.

A main source of frustration I experience still (before the refactor as well) is how configuration values propagate in multiple formats throughout the entire system. From src/cli to src/core to src/backend to src/plugins.

Having the instance system, reworked CLI and a better configuration abstraction, I think will greatly reduce code churn here.

Possibly some of these points will be included in Productivity / Quality of Life for instance maintainers. However:

This initiative is to save time and make it easier to maintain a SourceCred instance.

I’ve deliberately not included easier / time saving for developers here. So if these further refactors are time consuming and don’t help instance maintainers much, I’m inclined to punt on it for a while.

Topic		Replies	Views
Unified reference detection Initiatives 🏗️in-progress	10	2277	February 28, 2020
GraphQL Mirror fidelity awareness Initiatives ✔️completed	15	2089	March 1, 2020
Summary of planning chat Community	3	1539	October 20, 2019
Write the Initiatives Plugin Initiatives 🏗️in-progress	16	3075	February 29, 2020
Changing the CredSperiment weights The CredSperiment	20	4395	January 7, 2020