GIT Setup for Decentralized Collaboration

This guide describes how to set up the GIT version control system for repository-based decentralized collaboration. This deviates from the patch-based collaboration common in loosely connected teams (one example is the git development process itself). There, changes are sent in the form of patches via e-mail (often to a mailing list) and then integrated into a repository by its maintainer.

The patch-based approach has the advantage that patches are often read by more than one pair of eyes before being applied, which increases the chance of catching bugs early. However, for a small development team which can communicate face-to-face, the repository-based workflow might come with less overhead. In any case, it is easy to switch to patch-based collaboration if the need arises.

Prerequisites

We assume that git version 1.6.0 or later is installed. The commands described here were tested with git version 1.6.0.2. Basic familiarity with git is assumed. For further details on the commands and their meaning we refer to the git documentation.

Setting up a Public FMT Repository

We are setting up project foo with a repository from which others can pull changes. For public HTTP access, a good choice to store the repository is under ~/public_html/projects/foo.git/. For collaboration via SSH only, ~/projects/foo.git/ is sufficient, provided that others have read access to that directory.

# In the FMT home directory
$ cd ~/projects/
$ mkdir foo.git
$ cd foo.git
$ git --bare init

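For HTTP and other "dumb" transports, which serve the repository as plain files, the post-update hook has to be enabled in addition, so that the server information is refreshed after every push. (SSH is a smart transport and does not need this step.)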

# In the public repository (~/projects/foo.git/)
$ mv hooks/post-update.sample hooks/post-update   # only newer versions of GIT
$ chmod a+x hooks/post-update

This is likely the last time that something has to be changed manually in this directory.
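As a rough sketch of the steps above, the following script creates a bare repository in a throwaway directory and enables the hook. The temporary directory stands in for ~/projects/ and is an assumption of this example, as is the use of git init --bare, which needs a git that is somewhat newer than the one this guide was tested with. The sample hook that ships with git simply runs git update-server-info, which the last line invokes by hand.

```shell
set -eu
tmp=$(mktemp -d)                     # assumption: throwaway stand-in for ~/projects/
git init -q --bare "$tmp/foo.git"    # same effect as "git --bare init" inside foo.git/
cd "$tmp/foo.git"

# A bare repository has no working tree; git can confirm this.
is_bare=$(git rev-parse --is-bare-repository)
echo "bare: $is_bare"

# Enable the hook only if this git installation ships the sample file.
if [ -f hooks/post-update.sample ]; then
    cp hooks/post-update.sample hooks/post-update
    chmod a+x hooks/post-update
fi

# What the stock hook does after each push: refresh the files
# that dumb transports such as plain HTTP read.
git update-server-info
```

For SSH-only collaboration the hook part can be skipped entirely.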

Setting up a Working Repository

It is usually not a good idea to directly work in the repository from which others are going to pull changes. They would see all our work-in-progress branches. Also, we might be caught in the middle of a refactoring, leaving the code in an inconsistent state.

Thus, we use a working repository, from which we selectively push changes to the public repository when we deem them good enough for public consumption.

There are two options to create a working repository, depending on whether the public repository has been populated before. We first describe the steps where this is the case.

Cloning a Working Repository from a Public Repository

We can obtain a working repository from a public repository with the following command:

# clones public repository into directory foo-devel/
$ git clone ssh://host.ewi.utwente.nl/~fred/projects/foo.git/ foo-devel/

Later, we might want to pull more branches into our working repository. We can proceed just as if we were pulling changes from others into an existing repository. Since our branch comes from origin (what we cloned from), we don't need to configure any other remotes:

# Checkout branch refs/remotes/origin/release as branch release
$ git checkout -b release remotes/origin/release
Branch release set up to track remote branch refs/remotes/origin/release.
Switched to a new branch "release"
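The clone-and-track sequence can be tried without a server. The sketch below is only an illustration: it uses local paths in a temporary directory instead of the ssh:// URL, seeds a stand-in public repository with branches master and release, and commits under a made-up "Demo" identity. All of these names are assumptions of the example.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: local paths replace the ssh:// URLs
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Stand-in for fred's public repository, seeded with master and release.
git init -q --bare "$tmp/foo.git"
git --git-dir="$tmp/foo.git" symbolic-ref HEAD refs/heads/master
git init -q "$tmp/seed"
cd "$tmp/seed"
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "Initial commit"
git branch release
git push -q "$tmp/foo.git" master release

# Clone it, then create a local branch that tracks origin/release.
git clone -q "$tmp/foo.git" "$tmp/foo-devel"
cd "$tmp/foo-devel"
git checkout -q -b release remotes/origin/release

# The tracking setup is stored as two config entries.
track_remote=$(git config branch.release.remote)
track_merge=$(git config branch.release.merge)
echo "release tracks $track_merge on $track_remote"
```

The two config entries are the same kind of setting that is written by hand in the section on automating frequent merges.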

Populating a Public Repository from a Working Repository

In case the public repository is still unpopulated (as per the section on setting it up), we want to set up a working repository and populate the public repository from there. The steps necessary to set up a working repository and populate it have been described in detail elsewhere. After adding files and committing the changes to the working repository, we can proceed with publishing them to the public repository.

Publishing Changes

We push changes from the working repository to the public repository.

Setup

Unless we have cloned the repository, we need to tell git where to push changes:

# In the working repository
$ git remote add origin ssh://host.ewi.utwente.nl/~me/projects/foo.git/
$ git push origin +master                    # push branch master to origin (the + forces the update)

Regular Publishing

$ git push -v
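To see the one-time setup and the regular publishing step together, here is a small sketch that populates an empty public repository from a fresh working repository. The local paths, file name, and "Demo" identity are assumptions made for the example. It also passes -u on the first push, which is not part of the commands above: newer git versions refuse a bare git push unless the branch has an upstream configured, and -u records one.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: local paths replace the ssh:// URLs
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Empty public repository.
git init -q --bare "$tmp/foo.git"
git --git-dir="$tmp/foo.git" symbolic-ref HEAD refs/heads/master

# Fresh working repository with one commit on master.
git init -q "$tmp/foo-devel"
cd "$tmp/foo-devel"
git symbolic-ref HEAD refs/heads/master
echo "first draft" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# One-time setup: name the public repository and push master to it.
git remote add origin "$tmp/foo.git"
git push -q -u origin master

# Regular publishing: commit, then push.
echo "second draft" >> notes.txt
git commit -q -a -m "Extend notes"
git push -q

local_head=$(git rev-parse HEAD)
public_head=$(git --git-dir="$tmp/foo.git" rev-parse refs/heads/master)
echo "working: $local_head"
echo "public:  $public_head"
```

Both lines print the same commit id once the second push has gone through.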

Pulling Changes from Others into an Existing Repository

Setup

# In the working repository
$ git remote add fred ssh://host.ewi.utwente.nl/~fred/projects/foo.git/
# See what is available
$ git remote show fred
* remote fred
  URL: ssh://host.ewi.utwente.nl/~fred/projects/foo.git/
  New remote branch (next fetch will store in remotes/fred)
    master

# Fetch changes
$ git fetch fred

# Checkout branch refs/remotes/fred/master as branch fred
$ git checkout -b fred remotes/fred/master
Branch fred set up to track remote branch refs/remotes/fred/master.
Switched to a new branch "fred"

Getting Up-To-Date

# In the working repository, on branch "fred"
$ git fetch -v

Update everything from remote fred:

# In the working repository, on any branch
$ git fetch fred

Note that git fetch only updates remote-tracking branches (those under refs/remotes/); local branches are left untouched. However, which remote is fetched from by default can depend on the current branch.
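The effect of git fetch can be checked with a small experiment. In the sketch below, two local repositories in a temporary directory play the roles of fred's repository and our working repository; the paths and the "Demo" identity are assumptions of the example. Full ref names are used so that the local branch fred and the remote fred cannot be confused.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: local paths replace the ssh:// URLs
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Fred's repository with a single commit.
git init -q "$tmp/fred"
cd "$tmp/fred"
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "First"

# Our working repository with a local branch fred.
git init -q "$tmp/work"
cd "$tmp/work"
git remote add fred "$tmp/fred"
git fetch -q fred
git checkout -q -b fred refs/remotes/fred/master
before=$(git rev-parse refs/heads/fred)

# Fred commits again; we fetch.
(cd "$tmp/fred" && echo two >> file.txt && git commit -q -a -m "Second")
git fetch -q fred

after=$(git rev-parse refs/heads/fred)
tracking=$(git rev-parse refs/remotes/fred/master)
theirs=$(git --git-dir="$tmp/fred/.git" rev-parse refs/heads/master)
echo "local branch moved:    $([ "$before" = "$after" ] && echo no || echo yes)"
echo "tracking branch moved: $([ "$before" = "$tracking" ] && echo no || echo yes)"
```

The local branch only advances once the fetched changes are merged, which is the subject of the next section.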

Merging Changes

# In the working repository
$ git merge fred            # merge changes of branch fred into current branch

Getting Up-To-Date and Merging Changes in One Go

We fetch branch master from remote fred into the (local) branch fred and merge the result into the current branch foo:

# In the working repository, on branch foo
$ git pull fred master:fred

Another way to fetch changes and merge them into branch fred:

# In the working repository, on branch "fred"
$ git pull -v

Note that this works with branch fred only because it is tracking refs/remotes/fred/master. For other branches some more configuration is needed.

Automating Frequent Merges to the same Branch

When frequently merging changes from others to our local working branch, git can be configured to save the extra git fetch fred step:

# In the working repository
$ git config branch.work.remote fred
$ git config branch.work.merge refs/heads/master

Then, all that is needed to get up-to-date:

# In the working repository, on branch "work"
# Fetch & merge changes from refs/remotes/fred/master into branch "work"
$ git pull -v
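The following sketch exercises this configuration end to end. It is only an illustration under a few assumptions: local paths in a temporary directory replace the ssh:// URL, commits use a made-up "Demo" identity, and branch work is created with --no-track so that the two config entries are really what makes git pull work.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: local paths replace the ssh:// URLs
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Fred's repository with a single commit.
git init -q "$tmp/fred"
cd "$tmp/fred"
git symbolic-ref HEAD refs/heads/master
echo one > file.txt
git add file.txt
git commit -q -m "First"

# Our working repository; branch work starts at fred's master, without tracking.
git init -q "$tmp/work"
cd "$tmp/work"
git remote add fred "$tmp/fred"
git fetch -q fred
git checkout -q --no-track -b work refs/remotes/fred/master

# The configuration described above.
git config branch.work.remote fred
git config branch.work.merge refs/heads/master

# Fred commits again; a plain "git pull" on branch work now picks it up.
(cd "$tmp/fred" && echo two >> file.txt && git commit -q -a -m "Second")
git pull -q

ours=$(git rev-parse HEAD)
theirs=$(git --git-dir="$tmp/fred/.git" rev-parse refs/heads/master)
echo "work:          $ours"
echo "fred's master: $theirs"
```

Because branch work has no commits of its own here, the pull is a fast-forward and both ids match.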

Dealing with Conflicts

When merging (via git merge or git pull), conflicts sometimes arise. Both git merge and git pull merge as much as they can and then stop before committing. They leave conflict markers around the conflicting regions of each affected file, similar to other version control systems.

Redo from Start

To abandon a merge that stopped with conflicts and get back to the previous state of the branch (note that this discards all uncommitted changes):

# In the working repository
$ git reset --hard HEAD
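A small experiment shows what the reset does. The sketch below provokes a conflict in a throwaway repository and then backs out of the merge; the branch contents, file name, and "Demo" identity are assumptions made for the example.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: throwaway repository
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
echo "color = red" > file.txt
git add file.txt
git commit -q -m "Base"

# Two branches change the same line in different ways.
git checkout -q -b fred
echo "color = blue" > file.txt
git commit -q -a -m "Blue"
git checkout -q master
echo "color = green" > file.txt
git commit -q -a -m "Green"

# The merge stops with a non-zero exit status.
if git merge fred >/dev/null 2>&1; then
    echo "unexpected: merge succeeded" >&2
    exit 1
fi

# Back out: working tree and index return to the last commit on master.
git reset -q --hard HEAD
status=$(git status --porcelain)
content=$(cat file.txt)
echo "file.txt now contains: $content"
```

After the reset, git status reports a clean tree and the half-finished merge is forgotten.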

Manually Resolve a Conflict

A conflict can be resolved by removing the conflict markers and cleaning up the situation manually. Then the changed files must be refreshed:

# In the working repository, after cleaning up conflicts in file.txt
$ git add file.txt

Alternatively, we can refresh all touched files at once:

# In the working repository
$ git add -u

Finally, we commit the conflict resolution:

# In the working repository
$ git commit
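The manual resolution steps can be sketched as follows. The example provokes a conflict in a throwaway repository, resolves it by overwriting the file, and commits the result. File contents, the resolved value, and the "Demo" identity are all assumptions of the example.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: throwaway repository
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
echo "color = red" > file.txt
git add file.txt
git commit -q -m "Base"

# Two branches change the same line in different ways.
git checkout -q -b fred
echo "color = blue" > file.txt
git commit -q -a -m "Blue"
git checkout -q master
echo "color = green" > file.txt
git commit -q -a -m "Green"

# The merge stops and leaves conflict markers in file.txt.
if git merge fred >/dev/null 2>&1; then
    echo "unexpected: merge succeeded" >&2
    exit 1
fi
markers=$(grep -c '^<<<<<<< ' file.txt)

# Resolve by hand (here: replace the whole file), refresh, and commit.
echo "color = teal" > file.txt
git add file.txt
git commit -q -m "Merge branch fred, settle on teal"

second_parent=$(git rev-parse HEAD^2)
fred_tip=$(git rev-parse refs/heads/fred)
echo "conflict regions found: $markers"
```

The resulting commit is an ordinary merge commit: its second parent is the tip of branch fred.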

Resolve a Conflict with Tool Support

# In the working repository
$ git mergetool

Branch Management

Naming

Because branching in git is cheap, it is common to have several branches around. Branches can have different purposes, and for lack of other means this is best indicated by a naming convention, along with some usage guidelines.

Long-Lived Branches

Branches like master, next, etc., are meant for others to pull from. They exist for a long time, and their content changes over time. The history of long-lived branches should be stable. That is, once changes for a long-lived branch are pushed out into a public repository, they should not be rebased any more (local copies of the branch can be rebased with care, though). The reason is that rewriting the history "pulls the rug from under other developers' branches". Instead, the use of git merge is preferred.

Topic Branches

Ideally, each self-contained feature is developed on its own appropriately named branch. When it is reasonably stable, it can be merged into one of the long-lived branches.

In order to distinguish topic branches from long-lived branches, they can be marked as "work-in-progress", e.g., wip/shiny-feature.

It is common for topic branches to be rebased on the current HEAD revision frequently. The wip/ prefix indicates that it is likely not a good idea to branch off them, because their history might change completely at any time.

Squashing a Branch

If the history of a (topic) branch is not important (e.g., because it contains many trial-and-error commits whose details are irrelevant), it can be squashed (with git merge --squash) into the target branch: all commits of the topic branch are aggregated into a single commit on the target branch.
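As an illustration, the sketch below squashes a topic branch with three trial-and-error commits into master. The repository, branch name, and "Demo" identity are assumptions of the example, and git rev-list --count needs a git that is newer than the one this guide was tested with. Note that git merge --squash only stages the combined change; the single commit is created by the git commit that follows.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: throwaway repository
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$tmp/work"
cd "$tmp/work"
git symbolic-ref HEAD refs/heads/master
echo base > base.txt
git add base.txt
git commit -q -m "Base"

# Three small commits on a topic branch.
git checkout -q -b wip/shiny-feature
for i in 1 2 3; do
    echo "attempt $i" >> feature.txt
    git add feature.txt
    git commit -q -m "Attempt $i"
done

# Squash them into one commit on master.
git checkout -q master
git merge --squash wip/shiny-feature >/dev/null
git commit -q -m "Add shiny feature"

count=$(git rev-list --count master)
lines=$(grep -c '' feature.txt)
echo "commits on master: $count"
```

Master ends up with two commits (the base and the squashed feature), while feature.txt contains all three lines.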

Deleting a Local Branch

A local branch can be deleted with:

# Delete branch wip/test
$ git branch -d wip/test

Alternatively, if git refuses and you are really sure that the branch can go:

# Delete branch wip/test
$ git branch -D wip/test

Deleting a Remote Branch

A remote branch can be deleted with:

# Delete branch origin/wip/test
$ git push origin :wip/test

To get rid of fetched branches which were deleted on the remote side:

# Remove deleted branches from remote fred
$ git remote prune fred
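The interplay of the two commands can be sketched with three local repositories: a public one, the working repository of its owner, and our own repository, which knows the public one as remote fred. All paths and the "Demo" identity are assumptions of the example.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: local paths replace the ssh:// URLs
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Fred's public repository and his working repository.
git init -q --bare "$tmp/foo.git"
git --git-dir="$tmp/foo.git" symbolic-ref HEAD refs/heads/master
git init -q "$tmp/fred-devel"
cd "$tmp/fred-devel"
git symbolic-ref HEAD refs/heads/master
echo hello > README
git add README
git commit -q -m "Initial commit"
git branch wip/test
git remote add origin "$tmp/foo.git"
git push -q origin master wip/test

# Our repository fetches both branches from fred.
git init -q "$tmp/work"
cd "$tmp/work"
git remote add fred "$tmp/foo.git"
git fetch -q fred

# Fred deletes wip/test in his public repository.
(cd "$tmp/fred-devel" && git push -q origin :wip/test)

# Our copy of the branch is still around until we prune.
if git show-ref --verify -q refs/remotes/fred/wip/test; then before=present; else before=gone; fi
git remote prune fred >/dev/null
if git show-ref --verify -q refs/remotes/fred/wip/test; then after=present; else after=gone; fi
echo "before prune: $before, after prune: $after"
```

Pruning only removes the remote-tracking branch; a local branch created from it would have to be deleted separately with git branch -d.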

Sending Patches via Email

An alternative to publishing branches is to send a patch via email. This might have less overhead for small, self-contained changes.

Generating Patchsets

Sendable patches can be generated with, e.g.:

# writes the last N commits (here N = 1) up to the current HEAD revision to files 0001-..., 0002-...
$ git format-patch -k -s -1
# writes everything between "name" and the current HEAD revision into file foo-patchset.patch
$ git format-patch -k -s --stdout name > foo-patchset.patch

Emails can be sent with any email program which does not mangle text attachments, or with git send-email files... (which can be configured via .git/config or ~/.gitconfig to accommodate almost all imaginable needs).

Applying Patchsets

# applies patchset to current branch
$ git am -k foo-patchset.patch

In case of a conflict, we can roll back the changes, e.g.:

$ git am --abort

See also the git am --interactive option.
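A round trip without any email involved might look like the sketch below. One local repository plays the patch author, a clone of it plays the maintainer, and the patchset file is simply handed over on disk. The repository paths, the tag name base, and the "Demo" identity are assumptions of the example.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: the patchset is passed via a file, not by email
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

# The author's repository; tag "base" marks the common starting point.
git init -q "$tmp/author"
cd "$tmp/author"
git symbolic-ref HEAD refs/heads/master
echo "helo world" > file.txt
git add file.txt
git commit -q -m "Base"
git tag base

# The maintainer's repository starts from the same state.
git clone -q "$tmp/author" "$tmp/maintainer"

# The author commits a fix and writes it out as a patchset.
echo "hello world" > file.txt
git commit -q -a -m "Fix typo in file.txt"
git format-patch -k -s --stdout base > "$tmp/foo-patchset.patch"

# The maintainer applies the patchset to the current branch.
cd "$tmp/maintainer"
git am -q -k "$tmp/foo-patchset.patch"

subject=$(git log -1 --format=%s)
signoff=$(git log -1 --format=%B | grep -c '^Signed-off-by: ')
echo "applied: $subject"
```

Thanks to -k on both sides the subject line arrives unchanged, and -s has added a Signed-off-by line to the commit message.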

Release Management

Git branches can be used to make software releases from git easier and more predictable. The branching setup described here is essentially the scheme used by the git development team, as outlined by Junio Hamano.

master
The branch named "master" tracks the current releasable version. It should at any time be possible to release a new version of the software from the master branch, hence what goes in there should be stable new features (merged from "topic branches", see below).
maint
The "maint" branch is used to collect maintenance patches: bug fixes, documentation updates, etc. maint is merged into master regularly. You can either tell the maintainer which of your published branches to merge [1] or cherry-pick such patches from, or just send them to a mailing list. (See the section "Sending Patches via Email" for instructions.) It should at any time be possible to make a "point release" (bug fixes) from maint.
next
This is where new features go which are reasonably well developed but not completely finished (perhaps documentation is missing, etc.). next is advanced by merging in "topic branches", possibly several times. master is merged into next regularly to keep up with the maintenance changes. It is done this way because git is smart enough to skip patches which are both in master and next during these merges.
"topic branches"
New features are developed on "topic branches" until they are considered reasonably stable. Each topic branch has an owner (usually the person who started the branch). The owner is responsible for keeping the branch organized and up-to-date with respect to current developments. This means that a topic branch should be rebased onto the current "next" branch from time to time, and at the latest before it is considered for inclusion into next. After a topic branch has been merged into next, it should not be reorganized or rebased anymore beyond the published point.

[1] If the branch was forked from maint.
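The flow of changes between these branches can be sketched in a throwaway repository. The example below is a simplified illustration, not the git project's actual procedure: file names, commit messages, the topic branch name, and the "Demo" identity are assumptions, and git merge-base --is-ancestor needs a git that is newer than the one this guide was tested with.

```shell
set -eu
tmp=$(mktemp -d)   # assumption: throwaway repository
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.invalid

git init -q "$tmp/foo"
cd "$tmp/foo"
git symbolic-ref HEAD refs/heads/master
echo v1 > app.txt
git add app.txt
git commit -q -m "Initial release"
git branch maint
git branch next

# A bug fix lands on maint and is merged into master.
git checkout -q maint
echo fix > fix.txt
git add fix.txt
git commit -q -m "Fix crash on startup"
git checkout -q master
git merge -q maint

# A feature is developed on a topic branch and merged into next.
git checkout -q -b wip/shiny-feature next
echo feature > feature.txt
git add feature.txt
git commit -q -m "Add shiny feature"
git checkout -q next
git merge -q wip/shiny-feature

# master is merged into next to pick up the maintenance changes.
git merge -q -m "Merge master into next" master

if git merge-base --is-ancestor maint master; then fix_in_master=yes; else fix_in_master=no; fi
if git merge-base --is-ancestor master next; then master_in_next=yes; else master_in_next=no; fi
if git merge-base --is-ancestor wip/shiny-feature master; then feature_in_master=yes; else feature_in_master=no; fi
echo "fix in master: $fix_in_master, master in next: $master_in_next, feature in master: $feature_in_master"
```

The feature stays out of master until it is considered stable, while the bug fix is already part of all three long-lived branches.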