Chapter 6. Advanced Branching Techniques

The previous chapter, Collaborative Development with Git, described how to arrange teamwork, focusing on repository-level interactions. In that chapter, you learned about various centralized and distributed workflows, and their advantages and disadvantages.

This chapter goes deeper into the details of collaboration in distributed development. It explores the relations between local branches and branches in remote repositories, and introduces the concepts of remote-tracking branches, branch tracking, and upstream. It also teaches you how to specify the synchronization of branches between repositories, using refspecs and push modes.

You will also learn branching techniques: how branches can be used to prepare new releases and to fix bugs. You will learn how to use branches in a way that makes it easy to select which features go into the next version of the project.

In this chapter, we will cover the following topics:

  • Different kinds of branches, both long-lived and short-lived, and their purpose
  • Various branching models, including the topic branch workflow
  • Release engineering for different branching models
  • Using branches to fix a security issue in more than one released version
  • Remote-tracking branches and refspecs, the default remote configuration
  • Rules for fetching and pushing branches and tags
  • Selecting a push mode to fit the chosen collaboration workflow

Types and purposes of branches

A branch in a version control system is an active parallel line of development. Branches are useful, as we will see, to isolate and separate different types of work. For example, branches can be used to prevent your current work on a feature in progress from interfering with the management of bug fixes.

A single Git repository can have an arbitrary number of branches. Moreover, with a distributed version control system, such as Git, there could be many repositories (forks) for a single project, some public and some private; each repository will have its own local branches.

Before examining what collaboration between repositories looks like at the branch level, we need to know what types of branches we will encounter in local and remote repositories. Let's now talk about how these branches are used and examine why people would want to use multiple branches in a single repository.

Note

A bit of history: a note on the evolution of branch management

Early distributed version control systems used a one-branch-per-repository model. In their early days, the documentation of both Bazaar (then Bazaar-NG) and Mercurial recommended cloning the repository to create a new branch.

Git, on the other hand, had good support for multiple branches in a single repository almost from the start. However, at the beginning, it was assumed that there would be one central multibranch repository interacting with many single-branch repositories (see, for example, the legacy .git/branches directory to specify URLs and fetch branches, described in the gitrepository-layout(7) man page), though with Git it was more about defaults than capabilities.

Because branching is cheap in Git (and merging is easy), and collaboration is quite flexible, people started using branches more and more, even for solitary work. This led to the wide use of the extremely useful topic branch workflow.

There are many reasons for keeping a separate line of development, thus there are many kinds of branches. Different types of branches have different purposes. Some branches are long-lived or even permanent, while some branches are short-lived and expected to be deleted after their usefulness ends. Some branches are intended for publishing, some are not.

Long-running, perpetual branches

Long-lived or permanent branches are intended to last (indefinitely or, at least, for a very long time).

From the collaboration point of view, a long-lived branch can be expected to be there when you are next updating data or publishing changes. This means that one can safely start their own work basing it on (forking it from) any of the long-lived branches in the remote repository, and be assured that there should be no problems with integrating that work.

Also, what you find in public repositories are usually only long-lived branches. In most cases, these branches should never rewind (the new version is always a descendant of the old versions). There are some special cases here though; there can be branches that are rebuilt after each new release (requiring a forced fetch at that time), and there can be branches that do not fast-forward. Each such case should be explicitly mentioned in the developer documentation to help avoid unpleasant surprises.
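The "never rewinds" property can be checked mechanically: a branch update is a fast-forward when the old tip is an ancestor of the new tip. The following is a minimal sketch in a throwaway repository; the helper name is_fast_forward, the file name, and the commit messages are made up for illustration.

```shell
#!/bin/sh
# Sketch only: is_fast_forward is a hypothetical helper, not a Git command.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Developer"
git config user.email "dev@example.com"

# is_fast_forward OLD NEW: succeeds when OLD is an ancestor of NEW,
# that is, when moving a branch from OLD to NEW does not rewind it.
is_fast_forward() {
    git merge-base --is-ancestor "$1" "$2"
}

echo "one" > file.txt
git add file.txt
git commit -q -m "First"
old=$(git rev-parse HEAD)

echo "two" >> file.txt
git commit -q -a -m "Second"
new=$(git rev-parse HEAD)

if is_fast_forward "$old" "$new"; then
    echo "First -> Second: fast-forward"
fi

# Rewriting the tip replaces "Second" with a different commit,
# so the old tip is no longer part of the branch history.
git commit -q --amend -m "Second, rewritten"
rewritten=$(git rev-parse HEAD)

if ! is_fast_forward "$new" "$rewritten"; then
    echo "Second -> rewritten: not a fast-forward"
fi
```

A branch that is updated the second way is the kind that requires a forced fetch and deserves a mention in the developer documentation.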

Integration, graduation, or progressive-stability branches

One of the uses of branches is to separate ongoing development (which can temporarily include some unstable code) from maintenance work (where you are accepting only bug fixes). There are usually a few such branches. The intent of each of these branches is to integrate development work of the respective degree of stability, from maintenance work, through stable, to unstable or development work.

Fig 1. A linear view and a "silo" view of the progressive-stability branches. In the linear view, the stable revisions are further down the line in your commit history, and the cutting-edge unstable work is further up the history. Alternatively, we can think of branches as work silos, where work goes depending on the level of the stability (graduation) of changes.

These branches form a hierarchy with a decreasing level of graduation or stability of work, as shown in Fig 1. Note that, in real development, progressive-stability branches would not keep this simple image exactly as it is shown. There would be new revisions on the branches after the forking points. Nevertheless, the overall shape would be kept the same, even in the presence of merging.

The rule is to always merge more stable branches into less stable ones, that is, merge upwards, which will preserve the overall shape of branch silos (see also Fig 2 in the Graduation, or progressive-stability branches workflow section of this chapter). This is because merging means including all the changes from the merged branch. Therefore, merging a less stable branch into a more stable one would bring unstable work to the stable branch, violating the purpose and the contract of a stable branch.
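The merge-upwards rule can be tried out in a throwaway repository. The sketch below assumes three graduation branches named maint, master, and next (names used in this chapter); the file names and commit messages are invented for the example.

```shell
#!/bin/sh
# Minimal sketch of merging upwards; all file names and messages are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "release" > app.txt
git add app.txt
git commit -q -m "Release 1.0"

# Fork the graduation branches from the release.
git branch maint
git branch next

# Unstable work goes to the least stable branch.
git checkout -q next
echo "experiment" > experiment.txt
git add experiment.txt
git commit -q -m "Experimental feature"

# A bug fix goes to the most stable branch it applies to.
git checkout -q maint
echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix a bug in 1.0"

# Merge upwards: maint into master, then master into next.
git checkout -q master
git merge -q maint
git checkout -q next
git merge -q -m "Merge branch 'master' into next" master

# List the branches that now contain the fix.
git branch --contains maint
```

The fix ends up on all three branches, while the experimental commit stays on next only, which is exactly the contract of the more stable branches.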

Often, we see the graduation branches of the following levels of stability:

  • maint or maintenance, the branch for fixes, containing only bug fixes to the last major release; minor releases are made from this branch.
  • master, trunk, or stable, with the development intended for the next major release; the tip of this branch should always be in a production-ready state.
  • next, devel, development, or unstable, where new development goes to test whether it is ready for the next release; the tip can be used for nightly builds.
  • pu or proposed, for proposed updates; this is the integration testing branch meant for checking compatibility between different new features.

Having multiple long-running branches is not necessary, but it is often helpful, especially in very large or complex projects. Often in operations, each level of stability corresponds to its own platform or deployment environment, giving one branch per platform.

Per-release branches and per-release maintenance

Preparing for the new release of a project can be a lengthy and involved process. Per-release branches can help with this. The release branch is meant for separating the ongoing development from preparing the new release. It allows other developers to continue working on writing new features and on integration testing, while the quality assurance team with the help of the release manager takes time to test and stabilize the release candidate.

After creating a new release, keeping such per-release branches allows us to support and maintain older released versions of the software. At these times, such branches work as a place to gather bug fixes (for their software versions) and create minor releases.

Not all projects find per-release branches necessary. You can prepare a new release on the stable-work graduation branch, or use a separate repository instead of a separate branch. Also, not all projects need to support more than the latest version.

Branches of this type are often named after the release they are intended for, for example, release-v1.4 or v1.4.x (the branch had better not have the same name as the tag for the release, though).
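As a rough sketch of this naming scheme, the following throwaway repository forks a per-release branch, tags a release on it, and later tags a minor release from the same branch. The branch name release-v1.4 comes from the text above; the tag names v1.4.0 and v1.4.1, the file, and the commit messages are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: tag names and file contents are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "feature" > app.txt
git add app.txt
git commit -q -m "Development for 1.4"

# Fork the per-release branch; master stays free for new development.
git branch release-v1.4

# Stabilize and release on the per-release branch.
git checkout -q release-v1.4
echo "stabilized" >> app.txt
git commit -q -a -m "Stabilize 1.4"
git tag -a v1.4.0 -m "Release 1.4.0"

# Later, the same branch gathers bug fixes for a minor release.
echo "bugfix" >> app.txt
git commit -q -a -m "Fix a bug found after 1.4.0"
git tag -a v1.4.1 -m "Release 1.4.1"

# Branch and tag names differ, so no short name is ambiguous.
git for-each-ref --format='%(refname)' refs/heads refs/tags
```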

Hotfix branches for security fixes

Hotfix branches are like release branches, but for unplanned releases. Their purpose is to act upon an undesired state of a live production system or a widely deployed version, usually to resolve some critical bug in production (typically a severe security bug). This type of branch can be considered a longer-lived equivalent of the bugfix topic branches (see the Bugfix branches section of this chapter).

Per-customer or per-deployment branches

Let's say that some of your project's customers require a few customization tweaks, since they do things differently. Or perhaps there are some deployment sites that have special requirements. Suppose that these customizations cannot be done by simply changing the configuration. You would then need to create separate lines of development for these customers or customizations.

But you don't want these lines of development to remain separate. You expect that there will be changes that apply to all of them. One solution is to use one branch for each customization set, per customer or per deployment. Another would be to use separate repositories. Both solutions help maintain parallel lines of development and transfer changes from one line to another.

Automation branches

Say that you are working on a web application and you want to automate its deployment using a version control system. One solution would be to set up a daemon to watch a specific branch (for example, one named 'deploy') for changes. Updating such a branch would automatically update and reload the application.

This is, of course, not the only possible solution. Another possibility would be to use a separate deploy repository and set up hooks there, so that a push would trigger refreshing of the web application. Or, you could configure a hook in a public repository so that a push to a specific branch triggers redeployment (this mechanism is described in Chapter 11, Git Administration).
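To give a feel for the hook-based variant, here is a minimal sketch of the logic a post-receive hook could use. Git passes such a hook one line per updated reference on standard input, in the form "old-value new-value refname". The function name handle_push is hypothetical, the branch name deploy is taken from the example above, and the deployment step is only a placeholder that prints a message.

```shell
#!/bin/sh
# Sketch of post-receive hook logic; handle_push is a made-up name and
# the "deployment" is a placeholder echo, not a real deployment step.
handle_push() {
    # Each input line: <old-value> <new-value> <refname>
    while read -r oldrev newrev refname; do
        if [ "$refname" = "refs/heads/deploy" ]; then
            echo "deploy: would redeploy revision $newrev"
        fi
    done
}

# Simulate a push that updates two branches (shortened, fake revision IDs).
handle_push <<'EOF'
1111111 2222222 refs/heads/master
3333333 4444444 refs/heads/deploy
EOF
```

In a real hook, the function would simply read the hook's own standard input, and the echo would be replaced by whatever updates and reloads the application.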

These techniques can also be used for continuous integration (CI); instead of deploying the application, pushing into a specific branch would trigger a run of the test suite (the trigger could be creating a new commit on this branch or merging into it).

Mob branches for anonymous push access

Having a branch in a remote repository (on a server) with special treatment on push is a technique that has many uses, including helping to collaborate. It can be used to enable controlled anonymous push access for a project.

Let's assume that you want to allow random contributors to push into the central repository. You would want, however, to do this in a managed way: one solution is to create a special mob branch or a mob/* namespace (set of branches) with relaxed access control.

You can find how to set this up in Chapter 11, Git Administration.

The orphan branch trick

All the types of branches described up to this point differed in their purpose and management. However, from the technical point of view (from the point of view of the graph of commits), they all look the same. This is not the case with the so-called orphan branches.

The orphan branch is a parallel disconnected (orphaned) line of development, sharing no revisions with the main history of a project. It is a reference to a disjoint subgraph in the DAG of revisions, without any intersection with the main graph. In most cases, its checkout is also composed of different files.

Such branches are sometimes used as a trick to store tangentially related contents in a single repository, instead of using separate repositories. (When using separate repositories to store related contents, one might want to use some naming convention to denote this fact, for example a common prefix.) They can be used to:

  • Store the project's web page files. For example, GitHub uses a branch named gh-pages for the project's pages.
  • Store generated files, when the process of creating them requires some nonstandard toolchain. For example, the project documentation can be stored in html, man, and pdf orphan branches (the html branch can also be used to deploy the documentation). This way, users can get the documentation without needing to install its toolchain.
  • Store the project TODO notes (for example in the todo branch), perhaps together with storing there some specialized maintainer tools (scripts).

You can create such a branch with git checkout --orphan <new branch>, or by pushing into (or fetching into) a specific branch from a separate repository, as follows:

$ git fetch repo-htmldocs master:html

Note

Creating an orphan branch with git checkout --orphan does not technically create a branch, that is, it does not make a new branch reference. What it does is point the symbolic reference HEAD to an unborn branch. The reference is created after the first commit on a new orphan branch.

That is why the git branch command has no option to create an orphan branch.
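The behavior described in this note can be observed in a throwaway repository. The following is a minimal sketch; the branch name todo follows the example in the list above, while the file names and commit messages are made up.

```shell
#!/bin/sh
# Sketch only: file names and commit messages are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "int main(void) { return 0; }" > main.c
git add main.c
git commit -q -m "Initial commit"

# Point HEAD at an unborn branch; refs/heads/todo does not exist yet.
git checkout -q --orphan todo
git show-ref --verify --quiet refs/heads/todo || echo "todo: no reference yet"

# The index and working tree still hold the files from master; clear them.
git rm -rf -q .
echo "- write the documentation" > TODO
git add TODO
git commit -q -m "Start the TODO notes"

# The first commit created the reference.
git show-ref --verify --quiet refs/heads/todo && echo "todo: reference created"

# The two branches share no history, so there is no merge base.
git merge-base master todo || echo "master and todo have no common ancestor"
```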

Short-lived branches

While long-lived branches are meant to stay, short-lived or temporary branches are created to deal with a single issue, and are usually removed once that issue has been dealt with. They are intended to last only as long as the issue is present. Their purpose is time-limited.

Because of their provisional nature, they are usually present only in the local private repository of a developer or integration manager (maintainer), and are not pushed to public distribution repositories. If they appear in public repositories, they are there only in a public repository of an individual contributor (see the blessed repository workflow in Chapter 5, Collaborative Development with Git), as a target for a pull request.

Topic or feature branches

Branches are used to separate and gather together different subsets of development efforts. With easy branching and merging, we can go further than creating a branch for each stability level, as described earlier. We can create a separate branch for each separate issue.

The idea is to make a new branch for each topic, that is, a feature or a bug fix. The intent of this type of branch is both to gather together the subsequent development steps of a feature (where each step, a commit, should be a self-contained piece that is easy to review) and to isolate the work on one feature from the work on other topics. Using a feature branch allows topical changes to be kept together and not mixed with other commits. It also makes it possible for a whole topic to be dropped (or reverted) as a unit, be reviewed as a unit, and be accepted (integrated) as a unit.

The end goal for the commits on a topic branch is to be included in a released version of a product. This means that, ultimately, the short-lived topic branch is to be merged into the long-lived branch that gathers stable work, and then deleted. To make it easier to integrate topic branches, the recommended practice is to create such a branch by forking off the oldest, most stable integration branch that you will eventually merge it into. Usually, this means creating a branch from the stable-work graduation branch. However, if a given feature depends on a topic not yet in the stable line, you need to fork off the appropriate topic branch containing the dependency you need.
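The life cycle of a topic branch can be sketched as follows, in a throwaway repository. The branch name feature/search, the file, and the commit messages are invented for the example, and the use of --no-ff (which records the topic as a single merge commit even when a fast-forward is possible) is one possible choice rather than a requirement.

```shell
#!/bin/sh
# Sketch only: feature/search and its contents are hypothetical.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "app" > app.txt
git add app.txt
git commit -q -m "Stable base"

# Fork the topic branch off the stable integration branch.
git checkout -q -b feature/search master

# Develop the feature as a series of small commits.
echo "index" > search.txt
git add search.txt
git commit -q -m "Add search index"
echo "ui" >> search.txt
git commit -q -a -m "Add search UI"

# Integrate the topic as a unit, then delete the short-lived branch.
git checkout -q master
git merge -q --no-ff -m "Merge branch 'feature/search'" feature/search
git branch -q -d feature/search

git log --oneline --graph
```

Because the whole topic arrived through one merge commit, it can later be identified, reviewed, or reverted as a unit.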

Note that if it turns out that you forked off the wrong branch, you can always fix it by rebasing (see Chapter 7, Merging Changes Together, and Chapter 8, Keeping History Clean), as topic branches are not public.
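As a small preview of that fix, the sketch below forks a topic off the wrong branch and then transplants it with git rebase --onto. The branch names next and feature/report, the files, and the commit messages are assumptions made for the example; rebasing itself is covered in the chapters mentioned above.

```shell
#!/bin/sh
# Sketch only: branch and file names are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Stable base"

git checkout -q -b next
echo "unstable" > unstable.txt
git add unstable.txt
git commit -q -m "Unstable work"

# The mistake: the topic is forked off next instead of master.
git checkout -q -b feature/report next
echo "report" > report.txt
git add report.txt
git commit -q -m "Add report"

# Transplant the commits in next..feature/report onto master.
git rebase -q --onto master next feature/report

# The topic no longer includes the unstable commit.
git log --oneline feature/report
```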

Bugfix branches

We can distinguish a special case of a topic branch whose purpose is fixing a bug. Such a branch should be created starting from the oldest integration branch it applies to (the most stable branch that contains the bug). This usually means forking off the maintenance branch, or off the divergence point of all the integration branches, rather than the tip of the stable branch. A bugfix branch's goal is to be merged into the relevant long-lived integration branches.

Bugfix branches can be thought of as a short-lived equivalent of a long-lived hotfix branch.

Using them is a better alternative to simply committing fixes on the maintenance branch (or another appropriate integration branch).
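One way to find the divergence point of all the integration branches is git merge-base --octopus. The sketch below, in a throwaway repository, forks a bugfix branch there and merges it into each integration branch. The branch name bugfix/crash, the files, and the commit messages are hypothetical.

```shell
#!/bin/sh
# Sketch only: bugfix/crash and the file contents are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch "master"
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "buggy" > app.txt
git add app.txt
git commit -q -m "Release 1.0 (contains the bug)"
git branch maint
git branch next

# Let each integration branch move on independently.
for branch in maint master next; do
    git checkout -q "$branch"
    echo "$branch work" > "$branch.txt"
    git add "$branch.txt"
    git commit -q -m "Work on $branch"
done

# Fork the bugfix branch off the common ancestor of all three branches.
fork_point=$(git merge-base --octopus maint master next)
git checkout -q -b bugfix/crash "$fork_point"
echo "fixed" > app.txt
git commit -q -a -m "Fix crash"

# Merge the fix into every integration branch that contains the bug.
for branch in maint master next; do
    git checkout -q "$branch"
    git merge -q -m "Merge branch 'bugfix/crash' into $branch" bugfix/crash
done

git branch --contains bugfix/crash
```

Because the bugfix branch starts before the integration branches diverged, merging it brings in only the fix and none of the unrelated work from the other branches.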

Detached HEAD – the anonymous branch

You can think of the detached HEAD state (described in Chapter 3, Developing with Git) as the ultimate in temporary branches, so temporary that it doesn't even have a name. Git uses such anonymous branches automatically in a few situations, for example, during bisection and rebasing.

Because, in Git, there is only one anonymous branch and it must always be the current branch, it is usually better to create a true temporary branch with a temporary name; you can always change the name of the branch later.

One possible use of the detached HEAD is for proof-of-concept work. However, you need to remember to name the branch if the changes turn out to be worthwhile (or if you need to switch branches). It is easy to go from an anonymous branch to a named branch: you simply create a new branch from the current detached HEAD state.
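Going from the anonymous branch to a named one can be sketched as follows, in a throwaway repository. The branch name poc-idea, the files, and the commit messages are made up for the example.

```shell
#!/bin/sh
# Sketch only: poc-idea and the file contents are illustrative.
set -e
cd "$(mktemp -d)"
git init -q
git config user.name "Example Developer"
git config user.email "dev@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First"

# Detach HEAD and experiment on the anonymous branch.
git checkout -q --detach HEAD
echo "idea" > poc.txt
git add poc.txt
git commit -q -m "Proof of concept"

# HEAD is not a symbolic reference to any branch right now.
git symbolic-ref -q HEAD || echo "HEAD is detached"

# The experiment is worth keeping: give the anonymous branch a name.
git checkout -q -b poc-idea
git symbolic-ref --short HEAD
```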
