Resolving merge conflicts

Merging in Git is typically fairly easy. Since Git stores and has access to the full graph of revisions, it can automatically find where the branches diverged, and merge only those divergent parts. This works even in the case of repeated merges, so you can keep a very long-lived branch up to date by repeatedly merging into it or by rebasing it on top of new changes.

However, it is not always possible to combine changes automatically. There are problems that Git cannot solve, for example, because different changes were made to the same area of a file on different branches; these problems are called merge conflicts. Similar problems can occur while reapplying changes (for example, during a rebase), and they also surface as merge conflicts.

The three-way merge

Unlike some other version control systems, Git does not try to be overly clever about resolving merge conflicts, and it does not try to solve them all automatically. Git's philosophy is to be smart about determining when a merge can easily be done automatically (for example, by taking renames into account), and when automatic resolution is not possible, to stop rather than guess. It is better to bail out and ask the user to resolve a conflict, even one that a smarter algorithm might have resolved, than to automatically create an incorrect merge.

Git uses the three-way merge algorithm to come up with the result of the merge, comparing the common ancestor (base), the side being merged in (theirs), and the side being merged into (ours). This algorithm is very simple, at least at the tree level, that is, at the granularity of whole files. The following table explains the rules of the algorithm:

ancestor (base)   HEAD (ours)   branch (theirs)   result
A                 A             A                 A
A                 A             B                 B
A                 B             A                 B
A                 B             B                 B
A                 B             C                 merge

The rules for the trivial tree-level three-way merges are (see the preceding table):

  • If only one side changes a file, take the changed version
  • If both sides make the same change, take the changed version
  • If one side makes a different change from the other, there is a merge conflict at the contents level

It is a bit more complicated if there is more than one common ancestor or if a file is not present in all the versions. But usually it is enough to know and understand these rules.
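
The trivial rules above can be sketched as a small shell function. This is only an illustration, not how Git implements the merge: the function name tree_merge is hypothetical, and the letters stand in for the blob identifiers that Git would actually compare.

```shell
#!/bin/sh
# Hypothetical helper: decide the tree-level outcome for one file, given
# identifiers of its base, ours, and theirs versions (for example, blob hashes).
tree_merge() {
    base=$1
    ours=$2
    theirs=$3
    if [ "$ours" = "$theirs" ]; then
        echo "$ours"      # both sides agree (unchanged, or the same change)
    elif [ "$ours" = "$base" ]; then
        echo "$theirs"    # only theirs changed the file
    elif [ "$theirs" = "$base" ]; then
        echo "$ours"      # only ours changed the file
    else
        echo "merge"      # both changed it differently: contents-level merge
    fi
}

# Walk through the rows of the table.
tree_merge A A A
tree_merge A A B
tree_merge A B A
tree_merge A B B
tree_merge A B C
```

The sketch checks for agreement between the two sides first, which covers both the "nothing changed" row and the "same change on both sides" row with a single test.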

If one side changed the file differently from the other (where the type of the change counts; for example, renaming a file on one branch doesn't conflict with changing the contents of that file on the other branch), Git tries to merge the files at the contents level, using the provided merge driver if one is defined, and the contents-level three-way merge otherwise (for text files).

The three-way file merge examines whether the changes touch different parts of the file (different lines are changed, and these changes are well separated by more than diff context sizes away from each other). If these changes are present in different parts of the file, Git resolves the merge automatically (and tells us which files are automerged).

However, if you changed the same part of the same file differently in the two branches you're merging together, Git won't be able to merge them cleanly:

$ git merge i18n
Auto-merging src/rand.c
CONFLICT (content): Merge conflict in src/rand.c
Automatic merge failed; fix conflicts and then commit the result.
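
If you want to reproduce this kind of failure safely, a throwaway repository is enough. The following is a minimal sketch; the file name settings.txt, the branch names main and topic, and the placeholder identity are all assumptions made for the demonstration and have nothing to do with the src/rand.c example.

```shell
#!/bin/sh
# Build a scratch repository in which two branches edit the same line.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main      # fixed branch name for the demo
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"

echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"

git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"

git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"

# Both branches changed the same line, so the merge stops with a conflict
# and a non-zero exit status.
git merge topic
merge_status=$?
echo "git merge exited with status $merge_status"
```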

Examining failed merges

If Git is unable to resolve a merge automatically (or if you have passed the --no-commit option to the git merge command), it does not create a merge commit. Instead, it pauses the process and waits for you to resolve the conflict.

In modern Git, you can always abort the merge at this point with git merge --abort. With older versions, you would need to use git reset and delete .git/MERGE_HEAD.

Conflict markers in the worktree

If you want to see which files are still unmerged at any point after a merge conflict, you can run git status:

$ git status
On branch master
You have unmerged paths.
  (fix conflicts and run "git commit")

Unmerged paths:
  (use "git add <file>..." to mark resolution)

    both modified:      src/rand.c

no changes added to commit (use "git add" and/or "git commit -a")

Anything that has not been resolved is listed as unmerged. In the case of content conflicts, Git uses standard conflict markers, putting them around the place of conflict with the ours and theirs version of the conflicted area in question. Your file will contain a section that would look somewhat like the following:

<<<<<<< HEAD:src/rand.c
fprintf(stderr, "Usage: %s <number> [<count>]\n", argv[0]);
=======
fprintf(stderr, _("Usage: %s <number> [<count>\n"), argv[0]);
>>>>>>> i18n:src/rand.c

This means that the ours version on the current branch (HEAD) in the src/rand.c file is there at the top of this block between the <<<<<<< and ======= markers, while the theirs version on the i18n branch being merged (also from src/rand.c) is there at the bottom part between the ======= and >>>>>>> markers.

You need to replace this whole block by the resolution of the merge, either by choosing one side (and deleting the rest) or combining both changes, for example:

fprintf(stderr, _("Usage: %s <number> [<count>]\n"), argv[0]);

To help you avoid committing unresolved changes by mistake, Git by default checks whether committed changes include something that looks like conflict markers, refusing to create a merge commit without --no-verify if it finds them.

If you need to examine the common ancestor version to be able to resolve a conflict, you can switch to diff3-style conflict markers, which have an additional block showing the base version:

<<<<<<< HEAD:src/rand.c
fprintf(stderr, "Usage: %s <number> [<count>]\n", argv[0]);
|||||||
fprintf(stderr, "Usage: %s <number> [<count>\n", argv[0]);
=======
fprintf(stderr, _("Usage: %s <number> [<count>\n"), argv[0]);
>>>>>>> i18n:src/rand.c

You can switch the style of conflict markers for individual files by checking out the conflicted file again with the following command:

$ git checkout --conflict=diff3 src/rand.c

If you prefer to use this format all the time, you can set it as the default for future merge conflicts, by setting merge.conflictStyle to diff3 (from the default of merge).
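
To get a feel for the diff3 layout without setting up a conflicted repository, you can feed three small files to git merge-file, which accepts a --diff3 option. This is a sketch under stated assumptions: the file names and contents are made up, and git merge-file stands in here for the merge that git merge or git checkout --conflict=diff3 would perform.

```shell
#!/bin/sh
# Three hypothetical versions of a one-line file.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'color = red\n'   >base.txt
printf 'color = green\n' >ours.txt
printf 'color = blue\n'  >theirs.txt

# -p prints the result instead of overwriting ours.txt;
# --diff3 adds the base version between the ||||||| and ======= markers.
merged=$(git merge-file -p --diff3 ours.txt base.txt theirs.txt)
printf '%s\n' "$merged"

# Inside a repository, the same style can be made the default with:
#   git config merge.conflictStyle diff3
```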

Three stages in the index

But how does Git keep track of which files are merged and which are not? Conflict markers in the working directory files would not be enough. Sometimes, there are legitimate contents that look like conflict markers (for example, test files for merge, or files in the AsciiDoc format), and there are more conflict types than CONFLICT(content). How does Git, for example, represent the case where both sides renamed the file but in a different way, or where one side changed the file and the other side removed it?

It turns out that this is another use for the staging area of the commit (a merge commit in this case), which is also known as the index. In the case of conflicts, Git stores all the versions of each conflicted file in the index under stages; each stage has a number associated with it. Stage 1 is the common ancestor (base), stage 2 is the merged-into version from HEAD, that is, the current branch (ours), and stage 3 is from MERGE_HEAD, the version you're merging in (theirs).

You can see these stages for the unmerged files with the low level (plumbing) command git ls-files --unmerged (or for all the files with git ls-files --stage):

$ git ls-files --unmerged
100755 ac51efdc3df4f4fd318d1a02ad05331d8e2c9111 1  src/rand.c
100755 36c06c8752c78d2aaf89571132f3bf7841a7b5c3 2  src/rand.c
100755 e85207e04dfdd50b0a1e9febbc67fd837c44a1cd 3  src/rand.c

You can refer to each version with the :<stage number>:<pathname> specifier. For example, if you want to view a common ancestor version of src/rand.c, you can use the following:

$ git show :1:src/rand.c

If there is no conflict, the file is in stage 0 of the index.
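
The stages can be explored in a scratch repository. The sketch below assumes a hypothetical settings.txt file and branches named main and topic; it provokes a conflict and then reads each stage back from the index.

```shell
#!/bin/sh
# Scratch repository with a conflict on settings.txt (all names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"
git merge topic >/dev/null 2>&1            # stops with a conflict

# One index entry per stage for the conflicted path.
git ls-files --unmerged
stage_count=$(git ls-files --unmerged | wc -l | tr -d ' ')

# Read each version through the :<stage>:<path> syntax.
base=$(git show :1:settings.txt)           # common ancestor
ours=$(git show :2:settings.txt)           # HEAD (main)
theirs=$(git show :3:settings.txt)         # MERGE_HEAD (topic)
echo "base:   $base"
echo "ours:   $ours"
echo "theirs: $theirs"
```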

Examining differences – the combined diff format

You can use the status command to find which files are unmerged, and conflict markers do a good job of showing conflicts. But how do you see only the conflicts before working on them, and how do you check how they were resolved? The answer is git diff.

One thing to remember is that for merges, even the merges in progress, Git will show the so-called combined diff format. It will look as follows (for a conflicted file during a merge):

$ git diff
diff --cc src/rand.c
index 293c8fc,4b87d29..0000000
--- a/src/rand.c
+++ b/src/rand.c
@@@ -14,16 -14,13 +14,26 @@@ int main(int argc, char *argv[]
      return EXIT_FAILURE;
    }
  
++<<<<<<< HEAD
 +  int max = atoi(argv[1]);
 +  if (max > RAND_MAX) {
 +    fprintf(stderr, "Cannot handle <number> larger than %d (%d)\n",
 +            RAND_MAX, max);
 +    return EXIT_FAILURE;
 +  } else if (max < 2) {
 +    fprintf(stderr, "<number> cannot be smaller than %d (%d)\n",
 +            2, max);
 +    return EXIT_FAILURE;
 +  }
++=======
+   char *endptr = NULL;
+   long int val = strtol(argv[1], &endptr, 10);
+   if (*endptr) {
+     fprintf(stderr, "Invalid argument(s)\n");
+     return EXIT_FAILURE;
+   }
+   int max = (int) val;
++>>>>>>> 8c4ceca59d7402fb24a672c624b7ad816cf04e08
  
    srand(time(NULL));
    int result = random_int(max)

You can see a few differences from the ordinary unified diff format described in Chapter 3, Developing with Git. First, it uses diff --cc in the header to denote that it uses the compact combined format (it uses diff --combined instead if you used the git diff -c command). The extended header lines, such as index 293c8fc,4b87d29..0000000, take into account that there is more than one source version. The chunk header, @@@ -14,16 -14,13 +14,26 @@@, is modified (different from the one for the ordinary patch) to prevent people from trying to apply a combined diff as unified diff, for example, with the patch -p1 command.

Each line of the diff output is prefixed by two or more characters (two in the most common case of merging two branches): the first character describes the state of the line in the first preimage (ours) as compared to the result, the second character describes the other preimage (theirs), and so on. For example, ++ means that the line was not present in either of the versions being merged (in this example, you can see it on the lines with the conflict markers).

Examining differences is even more useful for checking the resolution of a merge conflict.

To compare the result (the current state of the working directory) with the version from the current branch (the one merged into), that is, the ours version, you can use git diff --ours. Similarly, git diff --theirs compares it with the version being merged in (theirs), and git diff --base with the common ancestor version (base).
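
As a rough sketch of how these comparisons behave, the following script provokes a conflict in a scratch repository, writes a tentative resolution into the working directory without staging it, and then compares that resolution against each side. The file name, branch names, and contents are hypothetical.

```shell
#!/bin/sh
# Scratch repository with a conflict on settings.txt (all names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"
git merge topic >/dev/null 2>&1            # stops with a conflict

# Tentative resolution: take the wording from topic, but do not stage it yet.
echo "greeting = bonjour" >settings.txt

vs_ours=$(git diff --ours)       # what the resolution changes relative to main
vs_theirs=$(git diff --theirs)   # what it changes relative to topic
vs_base=$(git diff --base)       # what it changes relative to the ancestor
printf '%s\n' "$vs_ours"
```

Because the tentative resolution is identical to the topic version, the comparison against theirs contains no changed lines, which is a quick way to confirm that one side was taken verbatim.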

How do we get there: git log --merge

Sometimes, we need more context to decide which version to choose or to otherwise resolve a conflict. One such technique is reviewing a little bit of history, to remember why the two lines of development being merged were touching the same area of code.

To get the full list of divergent commits that were included in either branch, we can use the triple-dot syntax that you learned in Chapter 2, Exploring Project History, adding the --left-right option to make Git show which side the given commit belongs to:

$ git log --oneline --left-right HEAD...MERGE_HEAD

We can further simplify this and limit the output to only those commits that touched at least one of the conflicted files, with a --merge option to git log, for example:

$ git log --oneline --left-right --merge
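
The difference between the two commands shows up as soon as one branch also contains a commit that has nothing to do with the conflict. In the sketch below (hypothetical file and branch names), the topic branch has an extra commit that only adds notes.txt; the triple-dot form lists it, while --merge leaves it out.

```shell
#!/bin/sh
# Scratch repository: topic has one conflicting and one unrelated commit.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
echo "remember to translate the docs" >notes.txt
git add notes.txt
git commit -q -m "Add notes"               # does not touch settings.txt
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"
git merge topic >/dev/null 2>&1            # conflict in settings.txt only

# All commits on either side since the branches diverged.
git log --oneline --left-right HEAD...MERGE_HEAD
all_count=$(git log --oneline --left-right HEAD...MERGE_HEAD | wc -l | tr -d ' ')

# Only the commits that touched a conflicted file.
git log --oneline --left-right --merge
merge_count=$(git log --oneline --left-right --merge | wc -l | tr -d ' ')
```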

This can be really helpful in quickly giving you the context you need to help understand why something conflicts and how to more intelligently resolve it.

Avoiding merge conflicts

While Git prefers to fail to automerge in a clear way, rather than to try elaborate merge algorithms, there are a few tools and options that one can use to help Git avoid merge conflicts.

Useful merge options

One of the problems while merging branches might be that they use different end of line normalization or clean/smudge filters (see Chapter 4, Managing Your Worktree). This might happen when one branch added such configuration (changing gitattributes file), while the other did not. In the case of end of line character configuration changes, you would get a lot of spurious changes, where lines differ only in the EOL characters. In both cases, while resolving a three-way merge, you can make Git run a virtual check out and check in of all the three stages of a file. This is done by passing the renormalize option to the recursive merge strategy (git merge -Xrenormalize). This would, as the name suggests, normalize end of line characters, and make them the same for all stages.

Changing end-of-line characters can be considered a special case of whitespace-related conflicts. It's pretty easy to tell that this is the case while looking at the conflict, because every line is removed on one side and added again on the other, and git diff --ignore-all-space shows a more manageable conflict (or even a conflict that is resolved). If you see that you have a lot of whitespace issues in a merge, you can abort and redo it, but this time with -Xignore-all-space or -Xignore-space-change. Note that whitespace changes mixed with other changes to a line are not ignored.
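
The abort-and-redo workflow could look like the following sketch. It assumes a hypothetical config.txt in which one branch only re-spaces a line while the other changes its value; the file name, branch names, and contents are made up for the demonstration.

```shell
#!/bin/sh
# Scratch repository: main changes only whitespace, topic changes a value.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
printf 'x = 1\ny = 2\n' >config.txt
git add config.txt
git commit -q -m "Add config"
git checkout -q -b topic
printf 'x = 100\ny = 2\n' >config.txt
git commit -q -a -m "Raise x"
git checkout -q main
printf 'x   =   1\ny = 2\n' >config.txt
git commit -q -a -m "Realign assignment"

# A plain merge sees two different changes to the same line.
if git merge topic >/dev/null 2>&1; then
    plain=clean
else
    plain=conflict
    git merge --abort                      # back to the pre-merge state
fi

# Redo the merge, treating whitespace-only changes as no change at all.
git merge -q --no-edit -Xignore-space-change topic
retry_status=$?
echo "plain merge: $plain, retry status: $retry_status"
cat config.txt
```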

Sometimes, mismerges occur due to unimportant matching lines (for example, braces from distinct functions). You can make Git spend more time minimizing differences by selecting patience diff algorithm with -Xpatience or -Xdiff-algorithm=patience.

If the problem is misdetected renames, you can adjust the rename threshold with -Xrename-threshold=<n>.

Rerere – reuse recorded resolutions

The rerere (reuse recorded resolutions) functionality is a bit of a hidden feature. As the name of the feature implies, it makes Git remember how each conflict was resolved chunk by chunk, so that the next time Git sees the same conflict it would be able to resolve it automatically. Note, however, that Git will stop at resolving conflicts and that it does not autocommit the said rerere-based resolution, even if it resolves it cleanly (if it is superficially correct).

Such a functionality is useful in many scenarios. One example is the situation when you want a long-lived (long development) branch to merge cleanly at the end, but you do not want to create intermediate merge commits. In this situation, you can do trial merges (merge, then delete merge), saving information about how merge conflicts were resolved to the rerere cache. With this technique, the final merge should be easy, because most of it would be cleanly resolved from the resolutions recorded earlier.

Another situation in which you can make use of the rerere cache is when you merge a bunch of topic branches into a testable permanent branch. If the integration test for a branch fails, you would want to be able to rewind the failed branch, but you would rather not lose the work spent on resolving a merge.

Or perhaps you have decided that you would rather use rebase than merge. The rerere mechanism allows you to carry the merge resolution over to the rebase.

To enable this functionality, simply set rerere.enabled to true, or create the .git/rr-cache directory.
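
A trial merge with rerere enabled could be sketched as follows. The scratch repository, file name, branch names, and the chosen resolution are all hypothetical; the point is only that the second attempt at the same merge picks up the recorded resolution.

```shell
#!/bin/sh
# Scratch repository with a conflict on settings.txt (all names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git config rerere.enabled true             # turn on recorded resolutions
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"

# Trial merge: resolve the conflict by hand and commit, which records
# the resolution in the rerere cache.
git merge topic
echo "greeting = salut" >settings.txt
git add settings.txt
git commit -q --no-edit

# Throw the trial merge away and merge again.
git reset -q --hard HEAD^
git merge topic

# The working file already holds the recorded resolution, but the path is
# still unmerged in the index: Git leaves the final check to you.
cat settings.txt
git ls-files --unmerged
```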

Dealing with merge conflicts

Let's assume that Git was not able to automerge cleanly, and that there are merge conflicts that you need to resolve to be able to create a new merge commit. What are your options?

Aborting a merge

First, let's cover how to get out of this situation. If you weren't perhaps prepared for conflicts or if you do not know enough about how to resolve them, you can simply back out from the merge you started with git merge --abort.

This command tries to reset to the state before you started the merge. It might not be able to do this if you did not start from a clean state. Therefore, it is better to stash away changes, if there are any, before performing a merge operation.

Selecting ours or theirs version

Sometimes, it is enough to choose one version in the case of conflicts. If you want to have all the conflicts resolved this way, forcing all the chunks to resolve in favor of the ours or theirs version, you can use the -Xours (or -Xtheirs) option of the recursive merge strategy. Note that -Xours (a merge option) is different from -s ours (a merge strategy); the latter creates a fake merge, where the merge contents are the same as the ours version, instead of taking the ours version only for conflicted chunks.

If you want to do this only for selected files, you can check out the file again in the ours or theirs version with git checkout --ours / --theirs.

You can examine the base, ours, or theirs version with git show :1:file, :2:file, :3:file, respectively.

Scriptable fixes – manual file remerging

There are types of changes that Git can't handle automatically, but they are scriptable fixes. The merge can be done automatically, or at least is much easier, if we could transform the "ours", "theirs" and "base" version first. Renormalization after changing how the file is checked out and stored in the repository (eol and clean/smudge filters) and handling the whitespace change are built-in options. Another non built-in example could be changing the encoding of a file, or other scriptable set of changes such as renaming variables.

To perform a scripted merge, you first need to extract a copy of each of these versions of the conflicted file, which can be done with the git show command and the :<stage>:<file> syntax:

$ git show :1:src/rand.c >src/rand.common.c
$ git show :2:src/rand.c >src/rand.ours.c
$ git show :3:src/rand.c >src/rand.theirs.c

Now that you have in the working area the contents of all the three stages of the files, you can fix each version individually, for example with dos2unix or with iconv, and so on. You can then remerge the contents of the file with the following:

$ git merge-file -p \
    src/rand.ours.c src/rand.common.c src/rand.theirs.c >src/rand.c
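
To illustrate why fixing the versions first can help, here is a self-contained sketch that runs git merge-file on three made-up files. One side was converted to CRLF line endings, so a direct merge conflicts; after stripping the carriage returns, the same merge is clean. The file names are hypothetical, and tr is used as a portable stand-in for dos2unix.

```shell
#!/bin/sh
# Three hypothetical versions of a file; theirs uses CRLF line endings.
work=$(mktemp -d)
cd "$work" || exit 1
printf 'one\ntwo\nthree\nfour\nfive\n' >base.txt
printf 'ONE\ntwo\nthree\nfour\nfive\n' >ours.txt
printf 'one\r\ntwo\r\nthree\r\nfour\r\nFIVE\r\n' >theirs.txt

# Direct merge: every line of theirs differs from base, so it conflicts.
# git merge-file exits with the number of conflicts.
git merge-file -p ours.txt base.txt theirs.txt >direct.txt
direct_conflicts=$?

# Scripted fix: normalize the line endings of the odd version first.
tr -d '\r' <theirs.txt >theirs.fixed.txt

# Remerge with the fixed version: the two real changes no longer overlap.
git merge-file -p ours.txt base.txt theirs.fixed.txt >merged.txt
fixed_conflicts=$?

echo "conflicts before: $direct_conflicts, after: $fixed_conflicts"
cat merged.txt
```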

Using graphical merge tools

If you want to use a graphical tool to help you resolve merge conflicts, you can run git mergetool, which fires up a visual merge tool and walks it through all the merge conflicts, one file at a time.

It has a wide set of preconfigured support for various graphical merge helpers. You can configure which tool you want to use with merge.tool. If you don't do this, Git would try all the possible tools in the sequence which depends on the operating system and the desktop environment.

You can also configure a setup for a tool of your own.

Marking files as resolved and finalizing merges

As described earlier, if there is a merge conflict for a file, it will have three stages in the index. To mark a file as resolved, you need to put the contents of a file in stage 0. This can be done by simply running git add <file>.

When all the conflicts are resolved, you simply need to run git commit to finalize the merge commit (or you can skip marking each file individually as resolved and just run git commit -a). The default commit message for a merge summarizes what is being merged, including a list of the conflicts, if any. It can also include a shortlog of the merged-in branches; this is controlled by the --log option and the merge.log configuration variable.
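
Put together, resolving and finalizing a merge might look like the sketch below. It uses a scratch repository with hypothetical names and resolves the conflict by taking the theirs version of the file, which is only one of the possible resolutions.

```shell
#!/bin/sh
# Scratch repository with a conflict on settings.txt (all names are made up).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"
git merge topic >/dev/null 2>&1            # stops with a conflict

# Resolve by taking the version from the branch being merged in.
git checkout --theirs -- settings.txt

# Mark the file as resolved: this replaces stages 1 to 3 with stage 0.
git add settings.txt
unmerged=$(git ls-files --unmerged)

# Finalize the merge, keeping the prepared merge message.
git commit -q --no-edit

# A merge commit lists itself plus two parents.
parents=$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')
git log --oneline --graph
```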

Resolving rebase conflicts

When there is a problem with applying a patch or a patch series, cherry-picking or reverting a commit, or rebasing a branch, Git will fall back to using the three-way merge algorithm. How to resolve such conflicts is described in earlier sections.

However, some of these operations, such as rebasing, applying a mailbox, or cherry-picking a series of commits, are done step by step (they are sequencer operations). For them there is an additional question, namely, what to do if there is a conflict during one of the individual steps.

You have three options. You can resolve the conflict, and continue the operation with the --continue parameter (or in case of git am, also --resolved). You can abort the operation and reset HEAD to the original branch with --abort. Finally, you can use --skip to drop a revision, perhaps because it is already present in the upstream and we can drop it during replaying.
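
The first option, resolving and continuing, could be sketched like this for a rebase. The scratch repository, names, and the chosen resolution are hypothetical; GIT_EDITOR=true is set only so that the script does not stop to ask for a commit message.

```shell
#!/bin/sh
# Scratch repository where rebasing topic onto main conflicts (made-up names).
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
echo "greeting = hello" >settings.txt
git add settings.txt
git commit -q -m "Add settings"
git checkout -q -b topic
echo "greeting = bonjour" >settings.txt
git commit -q -a -m "Use French greeting"
git checkout -q main
echo "greeting = hi" >settings.txt
git commit -q -a -m "Use short greeting"

# Replay topic on top of main; the single commit being replayed conflicts.
git checkout -q topic
git rebase main
rebase_status=$?

# Resolve the conflict, mark it as resolved, and carry on.
echo "greeting = salut" >settings.txt
git add settings.txt
GIT_EDITOR=true git rebase --continue
continue_status=$?

# Alternatives at the point of conflict would have been:
#   git rebase --skip     # drop the conflicting commit
#   git rebase --abort    # return to the original topic branch
git log --oneline
```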

git-imerge – incremental merge and rebase for git

Both rebase and merge have their disadvantages. With merge, you need to resolve one big conflict (though using test merges and rerere to keep proposed resolutions up to date could help with this) in an all-or-nothing fashion. There is almost no way to save a partially done merge or to test it; git stash can help, but it might be an inadequate solution.

Rebase, on the other hand, is done in step-by-step fashion. But it is unfriendly to collaboration; you should not rebase published parts of the history. You can interrupt a rebase, but it leaves you in a strange state (on an anonymous branch).

That's why the third-party git imerge tool was created. It presents conflicts pairwise in small steps. It records all the intermediate merges in such a way that they can be shared, so one person can start merging and another can finish it. The final resolution can be stored as an ordinary merge, as an ordinary rebase, or as a rebase with history.
