This post originated from an RSS feed registered with Agile Buzz
by Travis Swicegood.
Original Post: Workflow with Git
Feed Title: Travis Swicegood
Feed URL: http://travisswicegood.com/atom/
Feed Description: Posts on Git from Travis Swicegood, author of Pragmatic Version Control using Git.
I’ve been toying with my Git workflow the past year at Continuum and have
come up with a good workflow for handling semantically versioned software
inside Git. This post is my attempt to catalog what I’m doing.
Here’s the TL;DR version:
- master only ever contains tagged releases.
- All work happens in feature branches off of develop, and code gets merged
  back in to develop before master.
- Bug fixes are handled in branches created from tags and merged directly back
  in to master, then master is merged to develop.
That’s the high level overview. Below is that information in more depth.
master of code
The master branch always contains the latest released code. At any time,
you can checkout that branch, build it, install it, and know that it was the
same code you would have gotten had you installed it via npm, pypi, or conda.
Merges into master are always done with --no-ff and --no-commit. The
--no-ff ensures a merge commit so you can revert the commit if you ever need
to. Using --no-commit gives you a chance to adjust the version numbers in
the appropriate metadata files (conda recipe, setup.py, package.json, and
so on) to reflect the new version before committing. For most of my
releases, I’m simply removing the alpha suffix from the version number.
There should only be one commit in the repository for any given version
number and every commit that’s in master is considered to be released. Keep
in mind, that means you can’t use GitHub’s built-in Merge Pull Request
functionality for releases, but that’s ok by me. You have to go to the command
line to tag anyhow.
With the appropriate changes for versions, the next step is to create the
commit and then tag it as vX.Y.Z immediately. From there, you build the
packages and upload them or kick off your deployment tools and the code with
the new version is distributed.
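As a rough sketch of the release steps above, the script below replays them in a throwaway repository. The VERSION file, the version numbers, and the scratch directory are assumptions made for illustration; a real project would edit its setup.py, package.json, or conda recipe instead.

```shell
set -e
# Scratch repository so the sketch can run anywhere (illustrative setup only).
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical history: a 0.7.0 release on master, then work on develop.
echo "0.7.0" > VERSION
git add VERSION
git commit -q -m "Release 0.7.0"
git tag v0.7.0
git checkout -q -b develop
echo "0.8.0alpha" > VERSION
git commit -q -am "Bump version to 0.8.0alpha"

# The release itself: merge without fast-forwarding and without committing,
git checkout -q master
git merge -q --no-ff --no-commit develop
# drop the alpha suffix while the merge is still uncommitted,
echo "0.8.0" > VERSION
git add VERSION
# then create the single merge commit and tag it right away.
git commit -q -m "Release 0.8.0"
git tag v0.8.0
```

The tagged commit ends up with two parents, so the whole release can later be reverted as one unit.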
Managing Development with develop
Now you need to start working on the next feature release. All work happens in
the develop branch and it should have a new version number. The first thing
you should do is merge master in, then bump the version number to the next
minor release with a suffix of some sort. I use alpha, but you can change
that as needed depending on your language / tools.
For example, I just released v0.8.0 of an internal tool for testing yesterday
(no, it’s not being used in production yet, thus the 0 major version).
Immediately after tagging the new version, I checked out develop, merged
master into it via a fast-forward merge, then bumped the version number to
v0.9.0alpha. Now, every commit from that point forward will be the next
version with the alpha suffix so I can immediately see that it was built from
the repository.
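Here’s a minimal sketch of that post-release step, again in a scratch repository with a hypothetical VERSION file. The --ff-only flag is my own addition for the sketch: it makes the merge fail loudly if develop can’t simply be fast-forwarded to master.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical starting point: develop holds 0.8.0alpha and master has just
# received the v0.8.0 release as a --no-ff merge commit.
echo "0.7.0" > VERSION
git add VERSION
git commit -q -m "Release 0.7.0"
git checkout -q -b develop
echo "0.8.0alpha" > VERSION
git commit -q -am "Bump version to 0.8.0alpha"
git checkout -q master
git merge -q --no-ff --no-commit develop
echo "0.8.0" > VERSION
git add VERSION
git commit -q -m "Release 0.8.0"
git tag v0.8.0

# Bring develop up to the release commit with a fast-forward.
git checkout -q develop
git merge -q --ff-only master
# Start the next development cycle under a new alpha version.
echo "0.9.0alpha" > VERSION
git commit -q -am "Bump version to 0.9.0alpha"
```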
Managing Branches
Everything is developed in branches. New features, refactoring, code cleanup,
and so on happen off of the develop branch; bug fixes happen in branches
created directly from the tagged release that the bug fix is being applied to.
Let’s deal with feature branches first, they’re more fun.
I’ve gotten into the habit of adding prefixes to my branch names. New
features have feature/ tacked on at the start, refactor/ is used whenever
the branch is solely based on refactoring code, and fix/ is used when I’m
fixing something. The prefixes provide a couple of benefits:
- They communicate the intent of the branch to other developers. Reviewing a
  new feature requires a slightly different mindset than reviewing a set of
  changes meant solely to refactor code.
- They help sort branches. With enough people working on a code base, we’ll
  end up with a bunch of different types of changes in-flight at any given
  time. Having prefixes lets me quickly sort what’s happening, where it’s
  happening, and prioritize what I should be looking at. I generally don’t
  want any fix/ branches sitting around for very long.
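To illustrate the sorting benefit, here’s a small sketch that lists only the fix/ branches using a glob pattern. The branch names and the scratch repository are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Example Dev"
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Initial commit"

# Hypothetical branches, one for each of the three prefixes.
git branch feature/calendar-export
git branch refactor/query-builder
git branch fix/new-calendar-query

# A glob pattern narrows the listing to one kind of work at a time.
fix_branches=$(git branch --list 'fix/*' --format='%(refname:short)')
echo "$fix_branches"
```

Swapping the pattern for 'feature/*' or 'refactor/*' gives the equivalent view for the other kinds of work.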
Some people like having the developer name in the branch as well to provide a
namespace. I can understand this approach, but I think it’s wrong. First, Git
is distributed, so if you truly need a namespace for your code to live where it
doesn’t interact with others’ code, create a new repository (or fork if you’re
on GitHub).
The second, and much more important, reason I don’t like using names in
branches is that they promote code ownership. I’m all for taking ownership of
the codebase and particularly your changes. It’s part of being a
professional: own up to the code you created and all its flaws. What I’m not
for is fiefdoms in a codebase.
I worked at one company where I found a bug in the database interaction from
the calendar module. I fixed the bug in MySQL, but didn’t have the know-how to
fix the bug in the other databases. I talked to the engineering manager and
was directed to the developer that owned the calendar. I explained the bug, my
fix, and what I thought was needed for the other databases to work and they
were to fix it. When I left the company six months later, my fix still wasn’t
applied and none of the other databases had been fixed. All because the person
who owned the calendar code didn’t bother to follow through.
Having a branch called tswicegood/fix/new-calendar-query gives the impression
that I now own the new calendar fix. Removing the signature from that is a
small step toward increasing the team ownership of a code base and removing the
temptation to think of that feature as your own.
Managing Bugfixes
So what about bugs? You want the bug fix to originate as close to the
originally released code as possible. To do this, create the branch directly
from the tag, bump the version number, then work on your fix. For example,
let’s say you find a bug in v1.2.0 that you need to fix.
$ git checkout -b v1.2.1-prep v1.2.0
... adjust version number to v1.2.1alpha, then commit
The -b v1.2.1-prep tells Git to create a branch with that name, then check it
out. The v1.2.0 at the end tells Git to use that as the starting point for
the branch. The next commit adjusts the version number so anything you build
from this branch is going to be the alpha version of the bug fix. With that
bookkeeping out of the way, you’re ready to fix the code.
For projects that have a robust test suite (which unfortunately isn’t all of
them, even mine), the very next commit should be a failing test case by itself.
Even when you know the fix to make the test pass, you should create this commit
so there’s a single point in the history that you and other developers can
check out and run the tests to see the failure. The next commit then shows the
actual code that makes the test pass again.
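The two-commit pattern can be sketched like this. The add.sh and test.sh scripts are hypothetical stand-ins for real code and a real test suite, and the version-bump commit is left out to keep the sketch short.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical v1.2.0 release containing a bug: add.sh should print 4.
echo 'echo $((2 + 3))' > add.sh
git add add.sh
git commit -q -m "Release 1.2.0"
git tag v1.2.0

git checkout -q -b v1.2.1-prep v1.2.0
# Commit 1: the failing test, by itself.
echo '[ "$(sh add.sh)" = "4" ]' > test.sh
git add test.sh
git commit -q -m "Add failing test for add.sh"
# Commit 2: the fix that makes the test pass.
echo 'echo $((2 + 2))' > add.sh
git commit -q -am "Fix add.sh"

# Anyone can check out the first commit and watch the test fail...
git checkout -q HEAD~1
sh test.sh && before=pass || before=fail
# ...then move to the next commit and watch it pass.
git checkout -q v1.2.1-prep
sh test.sh && after=pass || after=fail
echo "before fix: $before, after fix: $after"
```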
Once the fix has been tested and is ready for release, it’s time to merge back
in to master. You should do this with --no-ff and --no-commit, and
remove the alpha suffix before committing, just like making a feature release.
Once you’ve merged and tagged the code, you need to get develop up-to-date
with the bug fix. Since master and develop have now diverged — remember,
develop has at least one commit bumping the version number — you have to deal
with a merge conflict.
Hopefully, the merge conflict is limited to the version number. If that’s the
case, you can just tell git merge to ignore those changes with this
command:
$ git checkout develop
$ git merge -X ours master
The -X option passes a strategy option to git merge, and using ours tells it
that, wherever the two sides conflict, the code in the branch you’re merging
into wins.
You need to be careful with this, however. It means that any real conflicts
would be swallowed up. Hopefully you know the changes well enough to realize
if there’s a larger conflict, but if for some reason you don’t know, you can
always try this approach:
$ git merge master
… ensure that the only conflicts are around the version
… numbers and that the develop branch code should be used
$ git reset --hard ORIG_HEAD
$ git merge -X ours master
You’ll have to manage any merge conflicts manually (or use git mergetool) if
the conflicts are larger than the version number change. If you do confirm
that you don’t need any of the conflicted changes, you can use git reset
--hard ORIG_HEAD to reset the working tree back to its pre-merge state, then
run git merge -X ours master to pull the changes in, ignoring the conflicts
from master.
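One way to take the guesswork out of that check is to ask Git which files are still unmerged after the trial merge. The sketch below assumes a hypothetical VERSION file and app.txt, and it applies the bug fix to master as a plain commit for brevity. It also uses git merge --abort to back out of the trial merge; that’s a substitution on my part for the git reset --hard ORIG_HEAD shown above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Hypothetical history: 1.2.0 released, develop bumped to 1.3.0alpha,
# then a 1.2.1 bug fix landed on master.
echo "1.2.0" > VERSION
echo "buggy" > app.txt
git add VERSION app.txt
git commit -q -m "Release 1.2.0"
git checkout -q -b develop
echo "1.3.0alpha" > VERSION
git commit -q -am "Bump version to 1.3.0alpha"
git checkout -q master
echo "1.2.1" > VERSION
echo "fixed" > app.txt
git commit -q -am "Release 1.2.1"

# Trial merge: expected to stop on a conflict, so tolerate the failure.
git checkout -q develop
git merge master >/dev/null 2>&1 || true
# List the files that are still unmerged.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted files: $conflicted"

# Redo the merge with -X ours only when VERSION is the sole conflict;
# otherwise the conflicted merge is left in place for manual resolution.
if [ "$conflicted" = "VERSION" ]; then
    git merge --abort
    git merge -q -X ours -m "Merge master into develop" master
fi
```

After the second merge, develop keeps its own alpha version number but still picks up the non-conflicting fix from master.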
On develop versus master
I’ve gone back and forth on this. My preference is to release often.
Sometimes multiple times a day. In that case, master is just a quick staging
ground. Create a branch, bump the version, write one feature, merge it, bump
the version number, rinse, then repeat.
There are a few problems with this approach. First, not every team or, for that
matter, project can work that way. Sometimes the code needs more testing across
multiple platforms or configurations. Sometimes there’s an integration test
suite that takes a while to run. Sometimes releases need to be timed to
coincide with scheduled downtime, giving you time to implement a few features
while waiting for your release window.
Second, it doesn’t scale. One branch that merges one feature is fine, but if
you have a team of developers working on a project you probably have multiple
things being worked on in parallel. Having them all branch off master, all
bump their version number, and all coordinate for an octopus merge (or merge
and release separately) is a nightmare.
Having everyone branch and merge off of develop provides a base that keeps in
sync with the rest of your code base. Your feature branch exists by itself,
and all it needs to do to stay in sync is occasionally merge develop.
Compared to git-flow
This is very similar to the workflow called git-flow. There are a few
differences.
If my memory serves, it used to call for branch names with the author’s name
in it (a re-reading of it now doesn’t show that though). That’s what remote
repositories are for, so I don’t want to use that.
Correction, nvie just confirmed that it’s never been there, so one of my
biggest gripes with it wasn’t founded. Oops. :-/
Next, hot fixes or bug fixes in git-flow are merged to master and develop
instead of only master. I want the versions going through master then back
out to develop. To me, it’s a cleaner conceptual model.
Versions, a thing I’ve written about, are important. I want
develop to be installable, but I don’t want it confused with any released
version. There should only be one commit, a tagged commit at that, in each
repository that can be built for any given version.
I don’t call out release branches in my description because my hope is that
they aren’t necessary. Of course, if your project has a long QA cycle that’s
independent of development or you’re trying to chase down a stray bug or two
before a release, then a release branch is great. I just don’t make them
required.
In Closing
The most important thing is to create some process for how code moves through
your repository, document it, and stick to it. Everyone always committing
directly to master is not sustainable. It also makes it much harder to
revert changes if something makes it in by accident as you have to go find all
the relevant commits instead of reverting one merge commit.
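As a sketch of what reverting one merge commit looks like in practice, the script below merges a two-commit feature branch with --no-ff and then undoes it in one step. The branch and file names are hypothetical.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/develop
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"

# Hypothetical feature branch with two commits.
git checkout -q -b feature/banner
echo "banner v1" > banner.txt
git add banner.txt
git commit -q -m "Add banner"
echo "banner v2" > banner.txt
git commit -q -am "Tweak banner"

# Merge with --no-ff so the whole feature sits behind one merge commit.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/banner" feature/banner

# Undo the entire feature by reverting that single merge commit;
# -m 1 treats the first parent (develop before the merge) as the mainline.
git revert --no-edit -m 1 HEAD
```

Had the two commits gone straight onto the branch, each would need to be found and reverted individually.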
Worse than a free-for-all in master is the hybrid. Committing some of the
time directly to master and other times to a feature branch means there’s no
pattern to how your code is used. What’s the threshold for creating a feature
branch? Is it based on how big the feature is, or how long it’s going to take?
Answering these questions distracts you and future contributors. Providing a
solid pattern of how contributions flow through your repository is an important
step in making your project more accessible to fellow contributors regardless
of whether those are in the open-source community or an office down the hall.
Some of the things outlined here might seem like a lot of overhead, but in the
end they save you time. Most importantly, they’ll scale beyond just you.