Proposal: add GitHub PR labels

Our PRs can accumulate and get a bit difficult to scan through.

To help with that I think we could use some (colored) labels.

These are the states I think a PR can be in:

  • Needs review
  • Reviewed, but holding off with a merge because this PR is part of a set and feedback from other PRs might cause changes in this one.
  • Ready to merge

The 3rd one I don’t know if we need a label for, because it would be better to just merge it then instead of apply the label :smile:

But the 1st and 2nd would certainly help if they are colored labels.

What are your thoughts on this and do you have a suggestion for a term for the 2nd state?

(Pinging @decentralion @wchargin @vsoch as regular GitHub contributors.)

Here’s my mental model of the commit status state space, which is actually two dimensional (review state and dependency state).

Primer on how we develop

In SourceCred, we develop at the level of semantically atomic commits (see CONTRIBUTING.md). This means that for a particular feature, we might have 10+ small commits, each of which makes a small, easily understood change to the codebase.

As a review+merge workflow, we put every commit into its own branch, and then create a sequence of PRs merging the small commits together. Like:

PR1: Merge branch has-commit-1 into master
PR2: Merge branch has-commit-2 into has-commit-1
PR3: Merge branch has-commit-3 into has-commit-2

Once every PR in the sequence is approved, we merge them together, with the following steps:

Merge PR1
Retarget PR2 to merge to master
Merge PR2
Retarget PR3 to merge to master
Merge PR3

In this example, we need to finish reviewing PR3 before we will merge PR1 or PR2. Hence, we tend to have an accumulation of PRs, and we need some way to organize + understand where they’re at, at a glance.

Understanding the state space

Rather than proposing a set of labels, I think we should try to first understand the (abstract) state space that commits are embedded in. Then, if we decide adding labels is the right approach, we can define a mapping from states to GitHub labels.

Review State

Needs Review

This commit is waiting for a code review. (Maybe it hasn’t been reviewed, or maybe it’s been reviewed but still needs feedback from a particular domain expert.) Generally, the commit should enter this state whenever its code changes.

Needs Update

This commit has been reviewed, and the reviewer wants some changes to the underlying commit.

Approved

This commit code has been reviewed and approved by all needed reviewers.

Readiness State

Non-ready Parents

This commit can’t yet merge, because its parent commit(s) aren’t yet ready. For example, they may not be approved yet.

Non-ready Children

This commit can’t yet merge, because its children commit(s) aren’t yet ready. For example, we may still be figuring out exactly how the features that depend on this commit will work, and we may want to tweak this commit as we discover new constraints.

Ready

This commit is ready. This means that every commit in this commit’s scope (parents and children) are approved and are also ready. At this point, we can merge this sequence of commits (starting with the oldest parent and proceeding one-by-one to the youngest child).

We could come up with GitHub PR labels, but I think it will be awkward. Managing state transitions by hand will be hard and the PRs will often be in the wrong state. We’ll need to invest in training folks to update them faithfully. Also many contributors may not have the right permissions to change the labels themselves.

(It’s also awkward because are development process operates at the level of commits, not PRs.)

Actions not Labels?

Could we instead use GitHub actions to implement a review process like the one we want? The copy for GitHub actions says:

Make code reviews, branch management, and issue triaging work the way you want.

If we can implement an automated code review management system that matches our workflow, that would be far better than a system of labels. @vsoch I know you’ve looked into GitHub actions; do you think this is possible?

I think for the actions to really work, we would need actions to have a first-class concept of a “commit chain” or (heh) “commit branch”. A “commit chain” is an ordered sequence of commits.

Then, the “Readiness State” from shifts from being a property of the commit to being a property of the “commit chain”. A commit chain is ready if every one of its constituent commits is approved; otherwise, it is not ready.

It would be great if we could organize the UI to group PRs by commit chain, and to have a “merge commit chain” button which automates merging all the constituent commits in the right order.

1 Like

Once every PR in the sequence is approved, we merge them together […] In this example, we need to finish reviewing PR3 before we will merge PR1 or PR2

Hmm, where is this requirement coming from? This isn’t how I usually do it (and, as far as I know, I was the one who popularized this workflow in the SourceCred community in the first place).

In fact, I would say that, as a rule, the opposite is closer to true. One primary purpose of the “small atomic commits” workflow is to avoid the “long-running development branch” effect, in which you are forced to merge large chunks of code at once, which is scary. It should be possible to merge any prefix of the commit sequence once every commit in that prefix is approved.

Something somewhat similar that I do practice: I sometimes wait until I’ve written the initial versions of a whole sequence before sending out any of them for review. I do this when I expect that development of the later commits will influence me to rewrite the earlier commits, creating churn that would be confusing for reviewers. But once they’re sent out, generally speaking, (a) the design will be reasonable, and (b) we know that the end state will be working, so there shouldn’t be a need to block on reviewing all commits before the first can be merged.

Review state

“Needs review”, “Needs update”, and “Approved” sound correct to me.

Readiness state

“Non-ready parents” sounds right to me (you can never merge before your parents are merged, unless you first change the topology so that they’re not your parents anymore). “Non-ready children” doesn’t sound right to me, as discussed above. You can always tweak implementation concerns later. The low-level design should already be reasonable. If there are outstanding questions about the design at a high enough level that the whole commit sequence might really need to be re-thought, then the work is large enough that the design should have been discussed and approved ahead of time.

I similarly disagree with other derived concepts discussed herein, like “commit branches” that need to be merged atomically, for essentially the same reasons. (If unclear, let me know what needs elaboration.)

GitHub actions

I have also looked into these and am happy to consult.

If you want to look into better commit automation, I’d start by looking at something like Bors rather than a GitHub Actions workflow, but both are probably worth considering.

1 Like

I think for the actions to really work, we would need actions to have a first-class concept of a “commit chain” or (heh) “commit branch”. A “commit chain” is an ordered sequence of commits.

Arguably anything that is programmable could be developed into actions. I’m less concerned about the automation here, and more about the humans involved. The commit model detailed here is not something I’ve encountered in other projects. As the number of contributors grows, it’s likely that core developers would follow the rule, but most likely a contributor is not going to read into the details, and will push commits haphazardly. Then will we force them to start again? There is one other community that has very strict rules like this, and although I understand the reasoning, it really pushes people away because it makes it harder to contribute. We arguably want to make it as easy and transparent to contribute as possible.

In an ideal world we could derive readiness in an automated fashion by way of perfectly formatted commits by said pull requests. But in a more realistic one, we can’t insure consistency here because we can’t rely on contributors to be perfect. For this reason, I’m +1 to support the simple solution of using labels. As for which are useful, I think that we can answer this by asking some simple questions.

Who is likely to be browsing pull requests?

Someone that wants to help review or someone that is looking for a feature being worked on. In the latter case, a search for text of interest is probably most useful. But for the first, if someone wants to help review, looking for “Needs Review” would be a salient indicator for them.

How do we represent dependencies?

Given some pull request needs review, the (current) way to say there is a dependency to another PR is just to copy paste the link, and GitHub renders it fully as a reference in both spots. Is that not sufficient enough?

In that the scope of this question is about PR labels, I’m in support. Maybe the other issues (automation and contributing requirements) could be discussed in other threads? Something with GitHub actions could definitely be useful.

1 Like

In my mind, going with this commit model actually represents a tradeoff between the quality-of-life for contributors wearing two different hats: committers and reviewers.

Early in our development cycle, @wchargin and I identified that there’s a tradeoff between code that is easy to write, and code that is easy to review. It’s easiest to write a large feature by creating a giant blob of changes, poking at it until has the desired behavior, and then sending it as a single PR. However, such changes are nearly impossible to review, and tend to either stagnate for long periods of time (bad for everyone) or receive a superficial review and approval-by-exhaustion (also bad for everyone).

Splitting a complex change into semantically atomic commits, each of which are small enough to hold within working memory, is more work for the committer. However, it makes reviewing a breeze, and enables much more thoughtful and high-quality reviews. I also believe that ease-of-review is the best proxy for long-term code readability and maintainability (since code that is easy to review is easy to understand). Therefore, investing in processes that prioritize ease-of-review over ease-of-contributing-code invests significantly in the maintainability of the codebase.

As the leader of SC development efforts, my goal isn’t to maximize the number of contributors, but to maximize the health and long-term success of the SC development community. Often, a smaller group that has tangibly invested in working well together will be more successful than a larger group with lower cohesion. Therefore, I feel confident in standing by our development process as it stands. (Although, in the spirit of “strong opinions weakly held”, I welcome any counter-arguments, or having any flaws in my reasoning pointed out.)

In practice, if someone send in a fundamentally good change which doesn’t meet our development guidelines, asking them to split it into smaller pieces is very different from “forcing them to start again”. Splitting a working change into smaller, atomic changes is generally much less work than building the whole thing from scratch.

That said, we should also invest in making this process as legible, well-supported, and easy to understand as we can. I think labels are a great place to start, and it seems like we have consensus on:

  • Needs Review
  • Needs Update
  • Approved

Shall we go ahead and merge this change to our GItHub organizational process? (I’ll consider this change “approved” if at least one person on this thread approves of the change, and no-one requests changes :wink:)

1 Like

I’d like to request clarification on why “don’t merge any commit until the whole sequence is approved” should be promoted as “official” policy, because (as noted above) I don’t understand that perspective and it seems counterproductive to me.

As it stands, I don’t see any need for additional labels. GitHub pull requests already have an “Approved” indicator, and if you click into the pull request, you can see whether the requested reviewers (if any) have submitted reviews. You can “request re-review” in the GitHub interface, which updates that state appropriately. So the three states of “needs review”, “needs update”, and “approved” are already derivable from the stock UI.

In my experience, labels like these tend to go stale and cause extra confusion when applied manually, so I’m weakly opposed to adding them without automation.

1 Like

Loving the excellent feedback here :smile: thank you.

Readiness State

The contention here is really interesting. The suggestion by @decentralion is to make it a hard requirement to have the whole chain approved. @wchargin the hard opposite of merging prefixes shortly after approving to avoid long-running branches.

In practice I’m currently sitting in between these two. (Though weakly held opinion)

This is something I do pretty much always. I write a local “prototype” which has the finished behavior, to make sure I’m not overlooking some requirement in the early commits.

However, I also hold back merging already-approved prefix commits for a similar reason. Working on a large feature like Initiatives, I’ve had multiple occasions where feedback during the review of PRs much later in the chain caused me to want to revise parts of the prefix with those new insights.

That said, changing requirements are a fact of software development, so maybe I’m being overly cautious there and should embrace stacking “should have been different in hindsight” commits on top of the prefixes.

Labels

Adding labels that overlap with GitHub reviews has a simple reason for me. I think the GitHub review status is too hard to visually scan and would like colors for them. Labels can offer that.

And probably when only making labels for the review state, that could be fully automatic.

But my end-goal for this proposal, is to be able to quickly answer the question: which PRs need attention next? And for that, the readiness state is important as well. So I suggested including labels for especially “Non-ready Children”.

2 posts were split to a new topic: Wish List Categorization

On merging base commits early

Thanks for requesting clarification. I got a little carried away in that post. What I meant to do was describe the state space for PRs. In practice, we have some PRs that I view as “unmerged due to unmerged children”.

I was trying to have a rich enough state space to describe what was currently happening. I did not mean to slip over into mandating that this is how it should happen.

In practice, I prefer your approach: merging incremental commits as they have been approved. Which is to say:

I agree, it’s fine to have “should have been different in hindsight” commits that show up later; arguably many refactors are like this.

So to the extent we are coming up with policy here, do we want a policy that commits are good to merge as soon as they are approved and have no un-approved dependencies? Or do we want to keep giving PR authors responsibility to decide when/how to merge their pulls?


On Labels

I didn’t realize that contributors can re-request approval and that clears the approved state. However, this is also a manual action which we’ll need to train people to do, so the stock UI doesn’t have any big advantages compared to labels.

I agree with @Beanow that adding colored labels will make the state much easier to scan. At a glance, how easy is it to find the build changes in this screenshot? And how easy is it to find which ones are approved and ready? :slight_smile:

If we agree to use labels for review state, it sounds like we could use GitHub actions to build automation to keep the labels up-to-date (set the label when there’s an approving review, reset the label whenever there’s a push, etc).

PR authors should always have the responsibility of deciding when and how to merge their pull requests. There are lots of cases when that is not “immediately after it’s approved”. For instance, I might send out a PR for review before going to lunch, but want to make sure to do some extra end-to-end testing when I get back. A quick approval in the middle shouldn’t force it to be merged.

The stock UI does have a big advantage compared to labels: the “Review requests” dashboard (https://github.com/pulls/review-requested) shows you all pull requests for which you still need to provide a review:

Screenshot of the reviews requested dashboard

This is, of course, properly integrated with the “Request re-review” action.

I check this many times per day (usually, multiple times per hour). Someone on my team at work proposed using this workflow about a year ago, and everyone got used to it in no time. There’s not really anything to learn; it works very well.

It’s a manual action, but it kind of has to be. You generally don’t want to set the state to “needs additional review” whenever additional changes are pushed. In most cases, when you’re ready for re-review you’ll also be submitting a counter-review to respond address comments, so you’re already in the UI and the action is top-of-mind.

I’m not sure why you’d want to find all the pull requests that have an approved review, but if you do, just filter for pull requests with approved reviews (equivalently, search for review:approved):

Screenshot of “approved reviews” dropdown

We surely could, but that’s automation that we have to write and maintain. Because it’s custom, we’ll also lose any familiarity benefits from people who are used to the normal GitHub UI. We’ll have to deal with awkward ACLing issues—external contributors can’t change labels on PRs (which is by design, because labels are often used to bless a PR to run on a trusted CI system), so how is an external contributor supposed to manually change the merge state if (e.g.) they realize that it’s not actually ready for re-review yet? Do we need to write a bot that responds to commands to change the labels? All this complexity really needs a compelling motivation to be worth its weight, and I’ve yet to be convinced that such a motivation exists.

2 Likes

Here’s another example that I just ran into: my PR #1524 (a docs update) is approved, but I’m going to leave it open for a day or so in case @burrrata (optional reviewer) has any comments.

This kind of thing happens all the time.

Also, just an administrative note: Earlier in this thread, two posts were split out into a new topic “A”, but that was subsequently split into another new topic “B”, and topic “A” was since deleted, so the link above in the thread is broken. You can find the moved posts (“B”) here: https://discourse.sourcecred.io/t/wish-list-categorization/564

Looks good to me. Thanks.

Sorry. The original split was sub-optimal for some reason so I split it again. Didn’t know that would break the links.

@wchargin, you’ve made a compelling case for sticking to the stock GitHub UI.

Personally, I wasn’t aware of either the filter for showing just pulls that are requesting my attention, and wasn’t aware that PR authors can trigger a request for re-review. Given that those two features exist, I think the case for building our own labeling system is a lot weaker. So I’m happy to try making better use of the stock UI, and then circling back in 2-4 weeks to see how it’s going.

@Beanow / @vsoch does that sound good to y’all?

Sounds good to me :+1:

(Also completely missed the https://github.com/pulls view)