TIL: git and github diff Differently
My team switched over to the SkullCandy git workflow last spring, and we did not make a new develop branch for a long time, because deleting the branch on github automatically deletes the branch of any open pull requests as well.

So, this week we ripped the band-aid off and remastered develop.
It’s been painful.
I was hoping a pull request from the develop branch into the master
branch would tell us which commits on develop are not in
master, so we could sort out the differences.
That pull request did not tell us anything. In fact, it revealed a disturbing fact: changes that I thought were in both branches were not there. How is that so??
I ran experiments to see what’s going on. You can see it here.
Replication
This is what I replicated on the repository, which is the workflow used for SkullCandy:
- start a master branch.
- create a develop branch by branching off the master branch.
- when starting a new feature, branch off the develop branch.
- when ready to merge a change into develop, make a pull request into it.
- after the change is in develop and validated, cherry-pick the commit from the branch into master and make a new pull request.
- done.
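The steps above can be sketched in a throwaway local repository. This is only an approximation: the pull requests are replaced by a local merge and a local cherry-pick, and the identity, commit messages, and the feature branch name are made up for illustration.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repo; identity and commit messages are assumptions.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # start a master branch
git config user.email "dev@example.com"
git config user.name "Dev"

echo "start of work stuff" > work_file.txt
git add work_file.txt
git commit -q -m "start of work"

git checkout -q -b develop                # develop branches off master
git checkout -q -b feature                # feature branches off develop
echo "work3 stuff" >> work_file.txt
git commit -q -am "work3"

# Stand-in for the pull request into develop (merge-commit strategy).
git checkout -q develop
git merge -q --no-ff -m "merge feature" feature

# Stand-in for the cherry-pick pull request into master. The fixed committer
# date only guarantees a new SHA even if everything runs within one second.
git checkout -q master
GIT_COMMITTER_DATE="2001-01-01T00:00:00" git cherry-pick feature >/dev/null

# Same change on every branch, but master carries it under a different SHA.
echo "develop: $(git log -1 --format='%h %s' develop)"
echo "feature: $(git log -1 --format='%h %s' feature)"
echo "master:  $(git log -1 --format='%h %s' master)"
```

After this runs, master and develop hold identical file contents, yet the work3 commit on master has a different SHA from the one on feature.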
After experimenting with different merge strategies (merge commit, squash
commit, rebase commit), I started to notice that on github, a change
on the master branch counts as the same change if and only if
its commit SHA matches.
When I checked locally, though, git showed no difference between master and the
corresponding develop and feature branches.
Example: develop3 and master
Let’s go through an example from the repository:
The master branch has all the work and its file contents are:
start of work stuff
work stuff 1
work stuff 2
work stuff 3
work stuff 4
work2 changes
more work2 changes
work3 stuff
more work3 stuff

The develop3 branch, which has the same work, has the same file
and its contents are:
start of work stuff
work stuff 1
work stuff 2
work stuff 3
work stuff 4
work2 changes
more work2 changes
work3 stuff
more work3 stuff

Locally
Doing a git diff on the command line produces no output:
vagrant@ubuntu-xenial:/vagrant$ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
vagrant@ubuntu-xenial:/vagrant$ git diff develop3
vagrant@ubuntu-xenial:/vagrant$

On github
When making a Pull Request on github.com, the result is:

diff --git a/work_file.txt b/work_file.txt
index bd6764b..9e7d796 100644
--- a/work_file.txt
+++ b/work_file.txt
@@ -5,3 +5,5 @@ work stuff 3
work stuff 4
work2 changes
more work2 changes
+work3 stuff
+more work3 stuff

That is pretty much as if the work never existed on master, but it is there!
https://github.com/a-leung/commit_tests/compare/master…develop3?expand=1
Why does this matter?
It’s important because git and github compute differences differently. I can’t trust github to be consistent with git, even for a simple change, if the SHAs do not match.
git can recognize the same code appearing under different SHAs; github relies on the SHAs to compute differences between branches.
The reason for the difference? git diff compares the contents of the two branches, while github’s pull request works from the commit history, where commits are matched by SHA, so a cherry-picked commit with a new SHA looks like a change the other branch never received.
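One way to see both views locally is to compare git's two-dot and three-dot diffs. This is a minimal sketch, assuming the pull request page corresponds to the three-dot form (which diffs develop3 against the last commit it shares with master). The scratch repo, identity, and commit messages are made up; only the branch and file names come from the example.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repo reproducing the master / develop3 situation.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Dev"

echo "more work2 changes" > work_file.txt
git add work_file.txt
git commit -q -m "work2"

git checkout -q -b develop3
echo "work3 stuff" >> work_file.txt
git commit -q -am "work3"

# Cherry-pick the same change onto master; the fixed committer date only
# guarantees that the new commit gets a different SHA.
git checkout -q master
GIT_COMMITTER_DATE="2001-01-01T00:00:00" git cherry-pick develop3 >/dev/null

# Two-dot form: compares the contents of the two branch tips.
two_dot=$(git diff master develop3)
# Three-dot form: compares develop3 with the merge base of master and develop3.
three_dot=$(git diff master...develop3)

if [ -z "$two_dot" ]; then
  echo "git diff master develop3: no output"
fi
echo "git diff master...develop3:"
printf '%s\n' "$three_dot"
```

The two-dot diff is empty because the file contents match, while the three-dot diff shows the work3 line as an addition, since the shared history stops before the commit that was cherry-picked.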
The only difference between the master and develop3 branches is
the SHA of the commit that made the change:
On master branch:
vagrant@ubuntu-xenial:/vagrant$ git blame -s work_file.txt
fabcea4b 1) start of work stuff
fabcea4b 2) work stuff 1
fabcea4b 3) work stuff 2
c92a36c5 4) work stuff 3
c92a36c5 5) work stuff 4
e492f5f3 6) work2 changes
e492f5f3 7) more work2 changes
de94346e 8) work3 stuff
de94346e 9) more work3 stuff

On develop3 branch:
vagrant@ubuntu-xenial:/vagrant$ git blame -s work_file.txt
fabcea4b 1) start of work stuff
fabcea4b 2) work stuff 1
fabcea4b 3) work stuff 2
c92a36c5 4) work stuff 3
c92a36c5 5) work stuff 4
e492f5f3 6) work2 changes
e492f5f3 7) more work2 changes
88e6fff2 8) work3 stuff
88e6fff2 9) more work3 stuff

So, that’s one area where git and github differ!
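For sorting out which commits on one branch already have an equivalent on the other despite different SHAs, git itself can compare commits by the patch they introduce. A hedged sketch using git cherry and git log --cherry-pick follows; the scratch repo, identity, and the extra work4 commit are assumptions added only to have one commit that really is missing from master.

```shell
#!/bin/sh
set -e

# Hypothetical scratch repo; work4 is an invented commit for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "dev@example.com"
git config user.name "Dev"

echo "start of work stuff" > work_file.txt
git add work_file.txt
git commit -q -m "start of work"

git checkout -q -b develop3
echo "work3 stuff" >> work_file.txt
git commit -q -am "work3"
echo "work4 stuff" >> work_file.txt
git commit -q -am "work4"

# Only work3 is cherry-picked into master, under a new SHA.
git checkout -q master
GIT_COMMITTER_DATE="2001-01-01T00:00:00" git cherry-pick develop3~1 >/dev/null

# "-" marks commits on develop3 whose patch already exists on master,
# "+" marks commits that master is still missing.
git cherry -v master develop3

# Commits only on develop3, skipping those with an equivalent patch on master.
git log --oneline --cherry-pick --right-only master...develop3
```

Here work3 is reported as already present on master and only work4 is listed as missing, which is closer to the answer I wanted from the pull request.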
Lesson Learned
We have to adjust our workflow for the ways git and github treat differences in code. It’s a subtle difference, but one with big consequences: we cannot rely on the tooling to help us, which adds work (and not the value-adding kind!)
For now, I will be remastering the develop branch with higher
frequency.