Changing history with Git Rebase

I’m a huge fan of using Git rebase to clean up the development I do in SourceCred. I think this article has some pretty good material on getting comfortable with rebasing; I recommend checking it out if you plan to commit to sourcecred/sourcecred.

https://git-rebase.io/

Hey, so I’m fairly new to Git, but I previously did the Udacity course on Kay’s suggestion, which got me to a place where I could happily make PRs on the Giveth wiki. I’m quite curious about this rebase feature, having skimmed the link.

What’s the score there? By which I mean, what are the use cases? I gather it’s for cleaning up commit history when you don’t make periodic commits or need to add extra context to a commit? In SourceCred’s case I can see the value in having a well-cultivated graph of commits for sure. It seems like a useful pattern, so I appreciate these new tools. Thanks for sharing.

I use git rebase all the time.

I like to keep an orderly, well-scoped sequence of commits as I work, where each commit has a very specific scope or change that it makes.

To use an example from my current hacking on timeline cred: I have a commit to improve PageRank performance, a commit that changes how the timeline seed vectors are computed, a commit that changes how the timeline charts are rendered, and so forth. The result is that it’s very easy to pull pieces out of my history and merge them separately. For example, the commit improving PageRank performance is not specific to timeline cred at all–I could merge it separately, thus enabling higher quality / clarity code review on that change in isolation, and reducing the review burden on someone reviewing my timeline work. (I review all my own code, so this saves me a lot of time and mental effort :slight_smile:.)

Suppose that I want to merge that PageRank performance commit separately. I’ll make a new branch off of origin/master:

$ git fetch
$ git checkout origin/master
$ git checkout -b pagerank-performance

Then I’ll cherry-pick the commit I want onto that branch. (A cherry-pick is kind of the simplest possible rebase. A rebase moves a sequence of commits on top of another branch. A cherry-pick moves a single commit.)

$ git cherry-pick 1234abcd....

Now I can make a pull request from this branch, which only contains a single commit and is easy to review.
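For anyone who wants to try this without touching a real project, here is a minimal sketch in a throwaway repository. Everything in it is hypothetical: the file names, the commit messages, and the use of a local master branch as a stand-in for origin/master.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # pin the branch name regardless of local defaults
git config user.name "Example"
git config user.email "example@example.com"

echo base > README
git add README
git commit -q -m "Initial commit"

# A feature branch with two unrelated, well-scoped commits.
git checkout -q -b timeline-cred
echo fast > pagerank.txt
git add pagerank.txt
git commit -q -m "Improve PageRank performance"
perf_commit=$(git rev-parse HEAD)
echo charts > timeline.txt
git add timeline.txt
git commit -q -m "Render timeline charts"

# A new branch from master that carries only the performance commit.
git checkout -q -b pagerank-performance master
git cherry-pick "$perf_commit" >/dev/null

# Lists the commits that a pull request from this branch would contain.
git log --oneline master..pagerank-performance
```

The timeline-cred branch is left untouched; the cherry-pick creates a new commit on pagerank-performance with the same change.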

Suppose that I make a few changes based on review feedback, and now go back to my timeline cred branch. Now I have a problem: my timeline cred branch has a different version of the PageRank performance commit than origin/master does. I can use git rebase to fix this.

$ git checkout timeline-cred
$ git fetch
$ git rebase -i origin/master

This will give me the ability to interactively rebase all my timeline cred commits on top of origin/master. Specifically, I can throw away my old (outdated) version of the PageRank performance commit, as I can now depend on the (reviewed and canonical) version on origin/master.
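A rough way to see the "throw away the outdated commit" step in action is sketched below. It assumes a throwaway repository with hypothetical file names, uses a local master in place of origin/master, and scripts the editor through GIT_SEQUENCE_EDITOR so that it runs unattended. In day-to-day use you would edit the todo list by hand and change "pick" to "drop" on the outdated commit's line.

```shell
#!/bin/sh
set -e

# Throwaway repository; all names below are made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"
echo base > README
git add README
git commit -q -m "Initial commit"

# Feature branch: a draft of the performance change, then the timeline work.
git checkout -q -b timeline-cred
echo draft > pagerank.txt
git add pagerank.txt
git commit -q -m "Improve PageRank performance (draft)"
echo charts > timeline.txt
git add timeline.txt
git commit -q -m "Render timeline charts"

# Meanwhile, the reviewed version of the performance change lands on master.
git checkout -q master
echo reviewed > pagerank.txt
git add pagerank.txt
git commit -q -m "Improve PageRank performance"

# Interactive rebase with a scripted editor: the first todo line (the oldest
# commit, i.e. the draft) is changed from "pick" to "drop".
git checkout -q timeline-cred
GIT_SEQUENCE_EDITOR="sed -i.bak -e '1s/^pick/drop/'" git rebase -q -i master

# Only the timeline commit remains on top of master.
git log --oneline master..timeline-cred
```

Picking the draft instead of dropping it would conflict here, since both versions add pagerank.txt with different contents.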

Fun fact: I never use git pull. git pull can create merge commits (whenever it merges a diverged remote branch into mine), which I find difficult to reason about. Instead, I use rebasing.

The trouble with rebasing is that you can diverge from what the upstream (GitHub) remembers you having, because you aren’t adding commits; you’re changing existing commits. So if you want to push after rebasing, you can’t just use git push. You need git push --force-with-lease. (--force would work too, but --force-with-lease adds a bit of safety: it refuses to overwrite the remote branch if it has moved since you last fetched it.)
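Here is a small sketch of that rejection and the forced push, assuming a local bare repository as a stand-in for GitHub. The paths, file name, and commit messages are hypothetical, and an amended commit stands in for a rebase since both rewrite already-pushed history.

```shell
#!/bin/sh
set -e

# A bare repository plays the role of the remote; names are made up.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/timeline-cred
git config user.name "Example"
git config user.email "example@example.com"
git remote add origin "$work/remote.git"

echo one > notes.txt
git add notes.txt
git commit -q -m "First draft"
git push -q origin timeline-cred

# Rewrite the commit that was already pushed.
git commit -q --amend -m "First draft, reworded"

# A plain push is rejected: the remote branch is not an ancestor of ours.
if git push -q origin timeline-cred 2>/dev/null; then
  echo "unexpected: plain push succeeded"
else
  echo "plain push rejected"
fi

# --force-with-lease overwrites the remote branch only if it still matches
# what this clone last saw (its origin/timeline-cred).
git push -q --force-with-lease origin timeline-cred
echo "forced push accepted"
```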

Thanks for taking the time to write this response; it’s helpful, and it’s a whole feature set of Git that I’d never heard of. The gist of what you’re talking about computes to a degree, but I’d ideally put it into practice somehow to ingrain it into habitual memory.

The practice of scoping commits is encouraged on the Udacity course; however, since I’m just adding wiki notes most of the time, my scope is pretty much limited to adding them with Atom and changing a YAML file pointer. Since I consider it the same action, I block it as one commit, though I used to do two. Doing more editing work, I’d probably commit in page or paragraph chunks depending on the level of detail.

My exploration of Git in that regard is thus pretty limited at the moment. Do you have any recommendations for ingraining such a workflow as one progresses (ideally from my current context)? I’m assuming there is probably a level of code competency required before trying to adopt more complex workflows like this, so at what stage of learning to code should one start to adopt these practices, and how? (I.e., at what point does rebase become relevant to me if all I currently do is commit notes, with a next-step trajectory of fluency in Neo4J?)

Thanks for the time :slight_smile:

To make git pull rebase instead of merge, all you need is git config --global pull.rebase true.
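To see the effect of that setting, here is a sketch with two throwaway clones of a local bare repository. The names "alice" and "bob", the paths, and the files are all hypothetical, and the setting is applied per repository (without --global) so that the experiment leaves your own configuration alone.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)

# A bare repository stands in for the shared remote.
git init -q --bare "$work/remote.git"
git --git-dir="$work/remote.git" symbolic-ref HEAD refs/heads/master

# First clone ("alice") publishes the initial commit.
git init -q "$work/alice"
cd "$work/alice"
git symbolic-ref HEAD refs/heads/master
git config user.name "Alice"
git config user.email "alice@example.com"
git remote add origin "$work/remote.git"
echo base > README
git add README
git commit -q -m "Initial commit"
git push -q origin master

# Second clone ("bob") opts in to rebasing pulls, for this repository only.
git clone -q "$work/remote.git" "$work/bob"
cd "$work/bob"
git config user.name "Bob"
git config user.email "bob@example.com"
git config pull.rebase true

# The histories diverge: alice pushes a commit, bob commits locally.
cd "$work/alice"
echo a > alice.txt
git add alice.txt
git commit -q -m "Alice's change"
git push -q origin master

cd "$work/bob"
echo b > bob.txt
git add bob.txt
git commit -q -m "Bob's change"

# With pull.rebase set, the pull replays bob's commit on top of alice's
# instead of creating a merge commit.
git pull -q
git log --oneline
```

For a one-off, git pull --rebase does the same thing without changing any configuration.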

Good question. Code competency is not a strict requirement, imho. The better proxy is experience with Git itself. You can understand it conceptually and still not totally grok it until you’ve used it every day for months and figured out how you prefer to jump over all the walls that you inevitably run into.

To some degree (again imho), you should seek out more complex workflows only to address specific pain points in your current workflows. But this doesn’t quite paint the whole picture, because if you don’t know what’s behind your horizon then you may not even realize that this little corner of your life could be an order of magnitude easier (cf. Blub programmers).

As one example: If you work on medium-to-large projects with teams of people, you may find that you want to create a sequence of changes/pull requests, each of which depends on the previous one. You could send them out one at a time, waiting for the review-revise-merge cycle before sending the next one, but this really wastes a lot of time and decreases the available context. How should you modify your workflow to better handle such dependent pull requests?

(This one is actually kind of a hard one. I’ve been honing my solution to this for a few years, and I still haven’t completely automated the latest formulation!)