Use gitk to understand git – merge and rebase
This is the second part of my Use gitk to understand git post.
In my initial overview, I demonstrated creating a branch, making a couple commits to that branch, and then merging them back into master. In that scenario, there were no changes in my local master (and since it was contrived, I knew there were no changes in the remote origin/master), so the merge was really just a fast-forward. In the real world, my workflow would be slightly different, as I would have to account for other people making changes to our shared repository (my origin remote).
To demonstrate, I’ll rewind time and pretend we’re back at the moment where we switched to master as we prepared to merge in the changes from the issue123 branch. The gitk visualization of the repository looked like:
Before I merge my changes into master, I want to make sure my master branch is in sync with the central repository on GitHub (which I refer to using the remote “origin”). We can see in the screenshot that my master branch refers to the same commit as origin/master, but that’s only because I haven’t communicated with origin in a long time. All of my previous operations were done locally. In order to get the latest state from the remote repository, I need to perform a fetch.
    d:\code\gitk-demo>git fetch origin
    remote: Counting objects: 7, done.
    remote: Compressing objects: 100% (4/4), done.
    remote: Total 6 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (6/6), done.
    From github.com:joshuaflanagan/gitk-demo
       bf37c64..ec8d10f  master -> origin/master
I’ve downloaded new commits to my local repository and moved the remote branch pointer, but I haven’t changed anything in my local branches. If I were to look in my working folder, I would see that none of my files have changed. To get the latest changes to the master branch from Tony, I need to merge them into my master branch.
    d:\code\gitk-demo>git merge origin/master
    Updating bf37c64..ec8d10f
    Fast-forward
     dairy.txt | 3 +++
     1 files changed, 3 insertions(+), 0 deletions(-)
     create mode 100644 dairy.txt
Once again, since there was a straight line from my local master to origin/master, git was able to perform a fast-forward merge. The master branch has moved to point to Tony’s latest commit. My working directory has been updated accordingly to have the changes he made.
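To make the fetch-then-merge sequence above easy to try in isolation, here is a minimal sketch that rebuilds a similar situation in a throwaway directory. The repository names (shared, mine), the file contents, and the demo identity are assumptions made up for the illustration, and committing directly into shared merely stands in for a teammate pushing to the central repository.

```shell
#!/bin/sh
# Sketch: "git fetch" moves origin/master only; "git merge" then
# fast-forwards the local master. Repository names are hypothetical.
set -eu

# Throwaway identity so commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# "shared" stands in for the central repository.
git init -q shared
cd shared
git symbolic-ref HEAD refs/heads/master
echo "apple" > fruits.txt
git add fruits.txt
git commit -q -m "third commit"
cd ..

# "mine" is my local clone; its remote is named origin by default.
git clone -q shared mine

# A teammate's commit lands in the shared repository.
cd shared
echo "milk" > dairy.txt
git add dairy.txt
git commit -q -m "Forgot the yogurt"
cd ../mine

before=$(git rev-parse master)
git fetch -q origin
after_fetch=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)

# The working folder is untouched by the fetch.
if [ -e dairy.txt ]; then file_after_fetch=yes; else file_after_fetch=no; fi

# --ff-only makes git refuse anything that is not a fast-forward.
git merge -q --ff-only origin/master
after_merge=$(git rev-parse master)

echo "master before fetch:  $before"
echo "master after fetch:   $after_fetch"
echo "origin/master:        $remote_tip"
echo "master after merge:   $after_merge"
```

The first two hashes printed should match (the fetch left master alone), and the last two should match (the merge moved master up to origin/master).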
Note that none of the changes I made for issue123 have been included in master yet. We need to merge the issue123 branch back into master, and ultimately push those changes to the shared repository on GitHub. However, there is no straight line between issue123 and master (neither is a direct descendant of the other), which means we cannot do a fast-forward merge. We have to do either a “real” merge or a rebase.
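The “straight line” test can also be asked of git directly instead of being read off the gitk picture. The sketch below assumes a scratch repository with invented file contents, and the can_fast_forward function is a hypothetical helper of my own; the only real git feature it relies on is git merge-base --is-ancestor, which succeeds when the first commit is an ancestor of the second.

```shell
#!/bin/sh
# Sketch: check whether master could be fast-forwarded to issue123.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo "apple" > fruits.txt
git add fruits.txt
git commit -q -m "third commit"

# Work on the issue branch.
git checkout -q -b issue123
echo "banana" >> fruits.txt
git commit -q -a -m "My first commit"
git checkout -q master

# Hypothetical helper: succeeds when branch $1 is an ancestor of $2,
# which is exactly the case where $1 can be fast-forwarded to $2.
can_fast_forward() {
    git merge-base --is-ancestor "$1" "$2"
}

if can_fast_forward master issue123; then before=fast-forward; else before=diverged; fi

# A new commit on master (standing in for Tony's change) breaks the straight line.
echo "milk" > dairy.txt
git add dairy.txt
git commit -q -m "Forgot the yogurt"

if can_fast_forward master issue123; then after=fast-forward; else after=diverged; fi

echo "before the new commit on master: $before"
echo "after the new commit on master:  $after"
```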
Merge
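To perform a “real” merge, we just use the merge command as we have all along. By default, git decides for itself whether a merge can be a fast-forward or has to be a real merge; it is not something you normally specify (although the --ff-only and --no-ff options exist if you want to insist on one or the other).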
    d:\code\gitk-demo>git merge issue123
    Merge made by recursive.
     fruits.txt | 1 +
     vegetables.txt | 3 ++-
     2 files changed, 3 insertions(+), 1 deletions(-)
Previously, with our fast-forward merges, no new commits were created; git just moved branch pointers. In this case, since there is a new snapshot of the repository that never existed before (it includes Tony’s new changes as well as my changes from issue123), a new commit is required. The commit is created automatically, with an auto-generated commit message indicating it was a merge. The merge commit has two parents (indicated by the red line going to the “Forgot the yogurt” commit and the blue line going to the “Added another fruit” commit). We can safely delete the issue123 branch now, but unlike in the fast-forward example, when we push our changes to the central server, there will be evidence that the issue123 branch existed (in the merge commit message, and in the repository history, which shows the branched paths).
    d:\code\gitk-demo>git branch -d issue123
    Deleted branch issue123 (was cac3c72).

    d:\code\gitk-demo>git push origin master
    Counting objects: 12, done.
    Delta compression using up to 2 threads.
    Compressing objects: 100% (6/6), done.
    Writing objects: 100% (8/8), 914 bytes, done.
    Total 8 (delta 0), reused 0 (delta 0)
    To git@github.com:joshuaflanagan/gitk-demo.git
       ec8d10f..5835415  master -> master
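The two-parent structure of the merge commit can be confirmed from the command line as well. This is a rough sketch in a scratch repository; the file contents and the demo identity are assumptions, and --no-edit is passed only so that the merge does not stop to open an editor for the commit message.

```shell
#!/bin/sh
# Sketch: a real merge creates a new commit with two parents.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo "apple" > fruits.txt
git add fruits.txt
git commit -q -m "third commit"

# My work on the issue branch.
git checkout -q -b issue123
echo "banana" >> fruits.txt
git commit -q -a -m "Added another fruit"

# Meanwhile master gains a commit of its own.
git checkout -q master
echo "milk" > dairy.txt
git add dairy.txt
git commit -q -m "Forgot the yogurt"

master_before=$(git rev-parse master)
issue_tip=$(git rev-parse issue123)

# The histories have diverged, so this produces a merge commit.
git merge -q --no-edit issue123

# %P prints the parent hashes of the commit, separated by spaces.
parents=$(git show -s --format=%P HEAD)
parent_count=$(echo "$parents" | wc -w | tr -d ' ')
first_parent=$(git rev-parse HEAD^1)
second_parent=$(git rev-parse HEAD^2)

echo "merge commit:  $(git rev-parse HEAD)"
echo "parent count:  $parent_count"
echo "first parent:  $first_parent (old master tip)"
echo "second parent: $second_parent (issue123 tip)"

# The branch is fully merged, so -d deletes it without complaint.
git branch -q -d issue123
```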
Rebase
There are a few reasons not to like the merge approach:
- Branching paths in the history can be unnecessarily complicated.
- The extra merge commit.
- Your branch is now no longer a private, local concern. Everyone now knows that you worked in an issue123 branch. Why should they care?
Note: There are some scenarios where you want to preserve the fact that work was done in a separate branch. In those cases, the above “downsides” are not really downsides, but the desired behavior. However, in many cases, the merge is only necessary because of the timing of parallel work, and preserving that timeline is not important.
You can use git rebase to avoid these issues. If you have commits that have never been shared with anyone else, you can have git re-write them with a different starting point. If we go back in time to the point right after we merged in Tony’s changes, but before merging in issue123:
Currently, the issue123 commits branch off from the “third commit”. The rest of the world doesn’t need to know that is where we started our work. We can re-write history so that it appears like we started our work from Tony’s latest changes. We want the issue123 commits to branch off from master, the “Forgot the yogurt” commit.
    d:\code\gitk-demo>git checkout issue123
    Switched to branch 'issue123'

    d:\code\gitk-demo>git rebase master
    First, rewinding head to replay your work on top of it...
    Applying: My first commit
    Applying: Added another fruit
After the rebase, the “My first commit” commit directly follows the “Forgot the yogurt” commit, making the issue123 branch a direct descendant of the master branch. This means we can now do a fast-forward merge to bring issue123’s changes into master.
    d:\code\gitk-demo>git checkout master
    Switched to branch 'master'

    d:\code\gitk-demo>git merge issue123
    Updating ec8d10f..b5a86d6
    Fast-forward
     fruits.txt | 1 +
     vegetables.txt | 3 ++-
     2 files changed, 3 insertions(+), 1 deletions(-)
When we delete the issue123 branch and push these changes to the remote repository on GitHub, there is no longer any evidence that the issue123 branch ever existed. Anyone who pulls down the repository will see a completely linear history, which is easier to understand.
    d:\code\gitk-demo>git branch -d issue123
    Deleted branch issue123 (was b5a86d6).

    d:\code\gitk-demo>git push origin master
    Counting objects: 9, done.
    Delta compression using up to 2 threads.
    Compressing objects: 100% (4/4), done.
    Writing objects: 100% (6/6), 626 bytes, done.
    Total 6 (delta 1), reused 0 (delta 0)
    To git@github.com:joshuaflanagan/gitk-demo.git
       ec8d10f..b5a86d6  master -> master
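For anyone who wants to replay the rebase workflow without touching a real project, the following sketch runs it end to end in a scratch repository. The file contents and demo identity are invented for the illustration. It shows two things the walkthrough relies on: the rebased commits are new commits with new hashes, and the resulting history on master contains no merge commits.

```shell
#!/bin/sh
# Sketch: rebase issue123 onto master, then fast-forward master.
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q .
git symbolic-ref HEAD refs/heads/master

echo "apple" > fruits.txt
git add fruits.txt
git commit -q -m "third commit"

# Two local commits on the issue branch, never shared with anyone.
git checkout -q -b issue123
echo "carrot" > vegetables.txt
git add vegetables.txt
git commit -q -m "My first commit"
echo "banana" >> fruits.txt
git commit -q -a -m "Added another fruit"

# Meanwhile master gains a commit of its own.
git checkout -q master
echo "milk" > dairy.txt
git add dairy.txt
git commit -q -m "Forgot the yogurt"

old_tip=$(git rev-parse issue123)

# Replay the issue123 commits on top of the current master.
git checkout -q issue123
git rebase -q master
new_tip=$(git rev-parse issue123)

# master is now an ancestor of issue123, so a fast-forward is enough.
git checkout -q master
git merge -q --ff-only issue123
git branch -q -d issue123

merge_commits=$(git rev-list --count --merges master)
total_commits=$(git rev-list --count master)

echo "issue123 tip before rebase: $old_tip"
echo "issue123 tip after rebase:  $new_tip"
echo "merge commits on master:    $merge_commits"
echo "total commits on master:    $total_commits"
git log --format=%s master
```

Because the rebase replaces the original commits with rewritten copies, it is only safe for commits that have not been shared, which is the same caveat given above.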