Way back in 2013, I described Stitch Fix’s git flow, as a reaction to the popular post “A Successful Git Branching Model”. Recently, Jussi Judin posted a reaction using the classic “considered harmful” tag line. I read the post with enthusiasm, but found the process it describes still too complex. This felt like a good time to refresh my previous post with how Stitch Fix currently does branching and Git (it’s even simpler than it was in 2013).
The Elements of Successful Configuration Management
First, I want to outline why things like Git Flow and Jussi’s proposal are “too complex”. Both suffer from a process geared more toward some abstract “purity” of the release history than toward making it easy for developers to simply do the right thing. If you’ve ever worked on a development team using Subversion, you’ve seen an extreme version of this.
It’s difficult to get Subversion to do branching and to manage those branches. Often, it requires a dedicated release engineer (or release team!) to keep it straight, because even when developers can manage the configuration management, it’s often difficult to follow and easy to screw up.
The issues with Git Flow are a bit more obvious, in that it has many special branches and tags that all work toward managing a fairly heavyweight release process. The benefits of such a process are few, especially if no external factors call for it.
The main issue with Jussi’s process, though, is that it makes it pretty easy to remove code from production inadvertently. In their process, bugfixes are merged to and from master, but master is just for development. Whatever’s on production is based around tags and special branches. In effect, developers have to remember to cherry-pick commits across branches so nothing gets lost.
While competent developers are capable of this, it’s too easy to forget, and there’s no obvious record of what is supposed to be on production. Git can already handle all this for you if you just create feature branches and merge them to master, which is what we do.
The Simplest Thing That Could Possibly Work
At Stitch Fix, our “Git Flow” is dead simple:
- master is what’s on production (or headed for production). When tests pass on master, the code is deployed.
- All new code goes onto a feature branch for review using a GitHub Pull Request.
- The author(s) merge to master when:
  - CI passes
  - They feel they’ve gotten enough feedback
  - They’ve arranged with their users or business partners that the change is ready
- For long-running branches (which we try very, very hard to avoid), the branch author(s) merge master into their branch as necessary. In practice, we try to ship small things frequently, so this doesn’t create many merge conflicts.
- For emergency bugfixes, still make a PR, but “instamerge” it. This notifies everyone that the change was made, but still gets the fix up quickly. If it’s super urgent, use @fixbot deploy «app» to production in Slack to skip the CI-based deploy.
That’s it. It’s so simple that everyone pretty much follows it without any real instruction. If you are interested in work happening in a repo, subscribe to it, and GitHub notifies you of PRs.
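As a rough sketch, the steps above map to plain git commands like the ones below. Everything here is illustrative: the throwaway repository, the add-greeting branch, and the file names are made up, and in practice the final merge happens through the GitHub Pull Request rather than a local git merge.

```shell
#!/bin/sh
# Sketch of the flow in a throwaway repository.
# Branch and file names are hypothetical, chosen only for illustration.
set -eu

# Identity for the demo commits (any values work here).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# 1. All new code goes onto a feature branch.
git checkout -q -b add-greeting
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# 2. Meanwhile, someone else ships a change to master.
git checkout -q master
echo "v2" > app.txt
git commit -q -am "Bump app"

# 3. For a longer-running branch, merge master into the branch as needed.
git checkout -q add-greeting
git merge -q --no-edit master

# 4. Once CI passes and review is done, merge the branch to master.
#    --no-ff keeps a merge commit, roughly what a GitHub PR merge records.
git checkout -q master
git merge -q --no-ff --no-edit add-greeting

git log --oneline --graph
```

The resulting graph shows both merges, which is the "honest" record of activity that merging (rather than rebasing) preserves.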
We decided on merging (instead of rebasing) because that felt more honest about the actual activity happening in the repo, but I could see this flow working almost as well by rebasing each branch before merge, so that master has a linear history. Of course, this is an extra step that people must remember and that someone has to enforce, so be very honest with yourself about the problem you are trying to solve by using rebasing over merging.
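For comparison, here is a minimal sketch of the rebase variant, again in a throwaway repository with made-up branch and file names. It is not something the flow above uses; it only shows what "rebase each branch before merge" would look like and why the history comes out linear.

```shell
#!/bin/sh
# Sketch of the rebase-before-merge alternative in a throwaway repository.
# Branch and file names are hypothetical, chosen only for illustration.
set -eu

export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Work happens on a feature branch, as before.
git checkout -q -b add-greeting
echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "Add greeting"

# master moves on in the meantime.
git checkout -q master
echo "v2" > app.txt
git commit -q -am "Bump app"

# Replay the branch's commits on top of the current master...
git checkout -q add-greeting
git rebase -q master

# ...then fast-forward master, so no merge commit is recorded.
git checkout -q master
git merge -q --ff-only add-greeting

git log --oneline --graph
```

The log is a straight line of three commits with no merge commits, at the cost of the extra rebase step each author has to remember.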
Since we’re talking about honesty, I will admit that the above flow may not work for everyone.
The Process Fits the Team
At Stitch Fix, our company values are very real and we take them seriously. They are a big part of our hiring process and, among other things, they allow us to work the way I’ve described above.
- Developers are responsible for their code’s behavior in production.
- We have a strong test culture and no manual QA (outside of our iOS app).
- We favor many small applications and services over fewer monolithic applications. The churn in any given repo is much lower using the former system architecture than the latter.
- We value (and interview for) responsibility and integrity. This means that while everyone is trusted to ship code at the right time, it’s OK if people make mistakes, provided they admit to them and let everyone else learn via post-mortems (I just don’t see how any company can value integrity without giving everyone “room to fail” like this).
If your team doesn’t value these things, or has to work under other constraints that make releases difficult and infrequent, the above system might not work.