At the data layer, we should scrape everything. We may add a plugin config option to assign special meanings to particular channels. Generally, however, a valuable contribution can happen in any channel, so I think we should be scanning the whole server.
At the graph creation layer, I think that for now, creating a node and edges for every single message would be too much noise, especially since it’s not obvious what the edge structure between messages should be. Instead, let’s create nodes only for messages that receive reactions, using the following structure:
Nodes
- message
- user
- reaction
Edges
- user → creates → reaction
- user → posts → message
- reaction → reacts_to → message
- message → references → (referencable)
Yep, the fact that we’ll get references is a feature, not a bug: it means, for example, that if my posting the PR in didathing results in a lot of reactions, then some more cred will flow to the PR.
This will be feasible thanks to @Beanow’s work on Unified Reference Detection.
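To make the proposed structure concrete, here is a minimal sketch in Python of how the nodes and edges listed above could be built. Everything in it is an assumption for illustration: the `Message` and `Reaction` types, the `build_graph` function, and the tuple-based node addresses are hypothetical, not the plugin's actual API. It also assumes that references have already been extracted from each message by some reference detection step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    user: str   # id of the user who reacted (hypothetical field)
    emoji: str  # name of the emoji used


@dataclass(frozen=True)
class Message:
    id: str
    author: str
    reactions: tuple = ()   # Reaction objects attached to this message
    references: tuple = ()  # addresses found by reference detection (assumed precomputed)


def build_graph(messages):
    """Return (nodes, edges), creating nodes only for messages with reactions."""
    nodes, edges = set(), set()
    for msg in messages:
        if not msg.reactions:
            continue  # unreacted messages are scraped but add no nodes
        message = ("message", msg.id)
        author = ("user", msg.author)
        nodes.update([message, author])
        edges.add((author, "posts", message))
        for r in msg.reactions:
            reactor = ("user", r.user)
            reaction = ("reaction", msg.id, r.user, r.emoji)
            nodes.update([reactor, reaction])
            edges.add((reactor, "creates", reaction))
            edges.add((reaction, "reacts_to", message))
        for ref in msg.references:
            target = ("referenceable", ref)  # opaque address of the referenced thing
            nodes.add(target)
            edges.add((message, "references", target))
    return nodes, edges


# Hypothetical data: one reacted message that references a PR, one ignored message.
example = [
    Message("m1", "alice", (Reaction("bob", "thumbsup"),), ("example-pr",)),
    Message("m2", "carol"),
]
nodes, edges = build_graph(example)
```

In this sketch the message `m2` never enters the graph, while the reaction on `m1` connects through the message to the referenced PR, which is the path along which cred could reach it.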