Fork off with your branches

The nice thing about git is that branches are cheap and you can create many and varied branches for every little fix, format or feature. The annoying thing about GitHub is that every user creates many many many branches.

One of my pet peeves at the moment is a large number of branches on our upstream repository. That is, the shared common one we all fork from and Pull Request into. When I started there were almost 200 branches! So many branches made it really hard to find out which branch was for what, and even harder to tell if any of the branches were not being used.

These might seem like small things, but if you have a CI server (I really hope you have a CI server) you might set up wildcard builds that automatically build every branch matching a certain pattern, for example feature/* or release/*. Now every one of these branches has a build that runs every time the code changes. You probably want to ensure they are up to date with master, so a script will run on a schedule (or be triggered by a merge) and auto-merge master down to each branch. (Why is this required? Probably because you have so many branches in the first place! Investigate git rebase…)

So now, every time the script runs, it triggers updates on every branch, and each update triggers a build. 200 branches? Instant 200 builds in your CI queue. (Who has 200 agents? Not me.) Frustration sets in when you can’t get your normal build through because your server is busy, and will stay busy for a while.
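To put rough numbers on that, here is a minimal sketch of the wildcard matching and the resulting queue. The patterns are the ones mentioned above; the branch names, the agent count and the build time are made up for illustration, and real CI servers may match patterns differently.

```python
import math
from fnmatch import fnmatchcase

# Patterns from the post; everything else below is a made-up example.
WILDCARD_PATTERNS = ["feature/*", "release/*"]


def branches_to_build(branches, patterns=WILDCARD_PATTERNS):
    """Return the branches a wildcard build configuration would pick up."""
    return [b for b in branches if any(fnmatchcase(b, p) for p in patterns)]


def queue_drain_minutes(build_count, agents, minutes_per_build):
    """Estimate how long the queue stays busy if every build takes the same time."""
    rounds = math.ceil(build_count / agents)
    return rounds * minutes_per_build


# 200 branches in total: 195 feature branches, 2 release branches, 3 others.
branches = [f"feature/task-{n}" for n in range(195)] + [
    "release/1.0", "release/1.1", "master", "hotfix-typo", "experiment",
]

triggered = branches_to_build(branches)
print(len(triggered))  # 197 of the 200 branches match a pattern
print(queue_drain_minutes(len(triggered), agents=4, minutes_per_build=10))  # 500
```

With four agents and ten-minute builds (both assumptions), one auto-merge run keeps the server busy for over eight hours before your own build gets a turn.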

The other obvious overhead to me is a cognitive one. If an individual is tracking more than one branch of work, they have to context switch back and forth between them. I can understand two work streams, where an external blocker on task one leaves you filling time with task two, but having three or more branches is juggling too much work. Taking this further, if you have a bunch of branches because the work is done and ready to release, get it shipped. It’s not done until it’s in production in front of customers (even if only early adopters or beta users). Using branches as a release task list isn’t sensible when everyone does it. Your branches get lost in the noise. You may even forget to release a branch of changes until your manager chases you for something you thought was released!

Let’s apply this to teams. A team can only focus on so many priorities (read: branches) before they start dropping balls, or wasting a lot of time on communication and the overhead of tracking these different tasks (bugs, features etc). The fewer active branches a team has to manage at once, the better. You also get a bit of a bystander effect on your branches: everyone assumes someone else owns the stale ones, so nobody cleans them up.

Frustrations aside, there are some real benefits from using a git flow or GitHub Flow approach where there is a single trunk of development (either a develop-release-master flow or a master-master-master flow) and you are subject to a continuous integration model. Things like feature flagging and dark releases are great ways to handle harder features, and code refactors such as renames, formatting and class extraction are much easier when there is only one branch to work in.

Some rules to work by:

Keep your small change branches in your fork.
Don’t push branches upstream.
Collaborative feature branches? Use feature toggles to maintain continuous integration.
Refactoring? Do it incrementally and ship often.
No more big bang changes, please.
Branches are cheap, but you don’t have to share them.
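
The feature toggle rule is the one that tends to need an example. A minimal sketch, assuming a plain dictionary as the toggle store (the flag name and both functions are hypothetical; real projects usually load flags from config or a flag service):

```python
# Hypothetical toggle store; the flag name is made up for illustration.
FEATURE_FLAGS = {"new_checkout": False}


def is_enabled(flag, flags=FEATURE_FLAGS):
    """Unknown flags default to off, so unfinished work stays dark."""
    return flags.get(flag, False)


def checkout_page(flags=FEATURE_FLAGS):
    """Both code paths live on the trunk; the toggle picks one at runtime."""
    if is_enabled("new_checkout", flags):
        return "new checkout"  # merged early, hidden until it is ready
    return "legacy checkout"


print(checkout_page())                        # legacy checkout
print(checkout_page({"new_checkout": True}))  # new checkout
```

The half-finished feature is merged and built with everything else, so there is no long-lived branch to keep in sync, and releasing it becomes a flag flip instead of a big bang merge.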

Another pet peeve is naming things. If you have to have a branch, name it after your team, your username, or the bug-tracking ID for the story of work it relates to. Cryptic ‘feature’ or ‘bug’ description names don’t help anyone. The diff says what the change is, so it is self-descriptive (aka self-documenting), but it doesn’t tell me who owns it.
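
If you want to check this automatically, something like the following could do it. This is only a sketch: the owner names and the ticket ID format (upper-case project key, dash, number) are assumptions, so swap in whatever your team and bug tracker actually use.

```python
import re

# Hypothetical conventions: known teams and usernames, plus a ticket ID
# shaped like "SHOP-123". None of these come from a real project.
KNOWN_OWNERS = {"team-payments", "asmith"}
TICKET_ID = re.compile(r"^[A-Z]+-\d+$")


def has_owner(branch):
    """True if the first path segment names a team, a user, or a ticket."""
    prefix = branch.split("/", 1)[0]
    return prefix in KNOWN_OWNERS or bool(TICKET_ID.match(prefix))


print(has_owner("asmith/fix-login"))        # True
print(has_owner("SHOP-123/retry-payment"))  # True
print(has_owner("feature/new-stuff"))       # False
```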

</rant>