Advanced Merging

Cherrypicking

Just as the term “changeset” is often used in version control systems, so is the term cherrypicking. This word refers to the act of choosing one specific changeset from a branch and replicating it to another. Cherrypicking may also refer to the act of duplicating a particular set of (not necessarily contiguous!) changesets from one branch to another. This is in contrast to more typical merging scenarios, where the “next” contiguous range of revisions is duplicated automatically.

Why would people want to replicate just a single change? It comes up more often than you'd think. For example, let's assume you've created a new feature branch /calc/branches/new-calc-feature-branch copied from /calc/trunk:

$ svn log ^/calc/branches/new-calc-feature-branch -v -r403
------------------------------------------------------------------------
r403 | user | 2013-02-20 03:26:12 -0500 (Wed, 20 Feb 2013) | 1 line
Changed paths:
   A /calc/branches/new-calc-feature-branch (from /calc/trunk:402)

Create a new calc branch for Feature 'X'.
------------------------------------------------------------------------

At the water cooler, you get word that Sally made an interesting change to main.c on the trunk. Looking over the history of commits to the trunk, you see that in revision 413 she fixed a critical bug that directly impacts the feature you're working on. You might not be ready to merge all the trunk changes to your branch just yet, but you certainly need that particular bug fix in order to continue your work.

$ svn log ^/calc/trunk -r413 -v
------------------------------------------------------------------------
r413 | sally | 2013-02-21 01:57:51 -0500 (Thu, 21 Feb 2013) | 3 lines
Changed paths:
   M /calc/trunk/src/main.c

Fix issue #22 'Passing a null value in the foo argument
of bar() should be a tolerated, but causes a segfault'.
------------------------------------------------------------------------

$ svn diff ^/calc/trunk -c413
Index: src/main.c
===================================================================
--- src/main.c  (revision 412)
+++ src/main.c  (revision 413)
@@ -34,6 +34,7 @@
…
# Details of the fix
…

Just as you used svn diff in the prior example to examine revision 413, you can pass the same option to svn merge:

$ cd new-calc-feature-branch

$ svn merge ^/calc/trunk -c413
--- Merging r413 into '.':
U    src/main.c
--- Recording mergeinfo for merge of r413 into '.':
 U   .

$ svn st
 M      .
M       src/main.c

You can now go through the usual testing procedures before committing this change to your branch. Once committed, the svn:mergeinfo on your branch reflects that r413 has been merged to the branch. This prevents future automatic sync merges from attempting to merge r413 again. (Merging the same change to the same branch almost always results in a conflict!) Notice also the mergeinfo /calc/branches/my-calc-branch:341-379. This was recorded during the earlier reintegrate merge to /calc/trunk from the /calc/branches/my-calc-branch branch which we made in r380. When we created the new-calc-feature-branch branch in r403, this mergeinfo was carried along with the copy.

$ svn pg svn:mergeinfo -v
Properties on '.':
  svn:mergeinfo
    /calc/branches/my-calc-branch:341-379
    /calc/trunk:413

Notice too that svn mergeinfo doesn't list r413 as "eligible" to merge, because it's already been merged:

$ svn mergeinfo ^/calc/trunk --show-revs eligible
r404
r405
r406
r407
r409
r410
r411
r412
r414
r415
r416
…
r455
r456
r457

The preceding means that when the time finally comes to do an automatic sync merge, Subversion breaks the merge into two parts. First it merges all eligible revisions up to revision 412. Then it merges all eligible revisions from revision 414 to the HEAD revision. Because we already cherrypicked r413, that change is skipped:

$ svn merge ^/calc/trunk
--- Merging r403 through r412 into '.':
U    doc/INSTALL
U    src/main.c
U    src/button.c
U    src/integer.c
U    Makefile
U    README
--- Merging r414 through r458 into '.':
G    doc/INSTALL
G    src/main.c
G    src/integer.c
G    Makefile
--- Recording mergeinfo for merge of r403 through r458 into '.':
 U   .
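The way an already-merged revision splits a sync merge into two passes can be modelled with a few lines of Python. This is only an illustrative sketch, not Subversion's actual algorithm: the function name unmerged_ranges is hypothetical, and the revision numbers are taken from the eligible list shown earlier (r404 through r457, with r413 already cherrypicked).

```python
def unmerged_ranges(first, last, already_merged):
    """Return inclusive (start, end) runs within first..last that skip
    every revision listed in already_merged."""
    ranges = []
    start = None  # first revision of the run currently being built
    for rev in range(first, last + 1):
        if rev in already_merged:
            # A merged revision closes the current run, if there is one.
            if start is not None:
                ranges.append((start, rev - 1))
                start = None
        elif start is None:
            start = rev
    if start is not None:
        ranges.append((start, last))
    return ranges


# r413 was cherrypicked earlier, so it splits the span in two.
print(unmerged_ranges(404, 457, {413}))  # [(404, 412), (414, 457)]
```

Each tuple corresponds to one independent patch applied to the working copy.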

This use case of replicating (or backporting) bug fixes from one branch to another is perhaps the most popular reason for cherrypicking changes; it comes up all the time, for example, when a team is maintaining a “release branch” of software. (We discuss this pattern in the section called “Release Branches”.)

	Warning
Did you notice how, in the last example, the merge invocation merged two distinct ranges? The svn merge command applied two independent patches to your working copy to skip over changeset 413, which your branch already contained. There's nothing inherently wrong with this, except that it has the potential to make conflict resolution trickier. If the first range of changes creates conflicts, you must resolve them interactively for the merge process to continue and apply the second range of changes. If you postpone a conflict from the first wave of changes, the whole merge command will bail out with an error message and you must resolve the conflict before running the merge a second time to get the remainder of the changes.

A word of warning: while svn diff and svn merge are very similar in concept, they do have different syntax in many cases. Be sure to read about them in svn Reference—Subversion Command-Line Client for details, or ask svn help. For example, svn merge requires a working copy path as a target, that is, a place where it should apply the generated patch. If the target isn't specified, it assumes you are trying to perform one of the following common operations:

You want to merge directory changes into your current working directory.
You want to merge the changes in a specific file into a file by the same name that exists in your current working directory.

If you are merging a directory and haven't specified a target path, svn merge assumes the first case and tries to apply the changes into your current directory. If you are merging a file, and that file (or a file by the same name) exists in your current working directory, svn merge assumes the second case and tries to apply the changes to a local file with the same name.

Merge Syntax: Full Disclosure

You've now seen some examples of the svn merge command, and you're about to see several more. If you're feeling confused about exactly how merging works, you're not alone. Many users (especially those new to version control) are initially perplexed about the proper syntax of the command and about how and when the feature should be used. But fear not, this command is actually much simpler than you think! There's a very easy technique for understanding exactly how svn merge behaves.

The main source of confusion is the name of the command. The term “merge” somehow denotes that branches are combined together, or that some sort of mysterious blending of data is going on. That's not the case. A better name for the command might have been svn diff-and-apply, because that's all that happens: two repository trees are compared, and the differences are applied to a working copy.

If you're using svn merge to do basic copying of changes between branches, an automatic merge will generally do the right thing. For example, a command such as the following,

$ svn merge ^/calc/branches/some-branch

will attempt to duplicate any changes made on some-branch into your current working directory, which is presumably a working copy that shares some historical connection to the branch. The command is smart enough to only duplicate changes that your working copy doesn't yet have. If you repeat this command once a week, it will only duplicate the “newest” branch changes that happened since you last merged.

If you choose to use the svn merge command in all its full glory by giving it specific revision ranges to duplicate, the command takes three main arguments:

An initial repository tree (often called the left side of the comparison)
A final repository tree (often called the right side of the comparison)
A working copy to accept the differences as local changes (often called the target of the merge)

Once these three arguments are specified, then the two trees are compared and the differences applied to the target working copy as local modifications. When the command is done, the results are no different than if you had hand-edited the files or run various svn add or svn delete commands yourself. If you like the results, you can commit them. If you don't like the results, you can simply svn revert all of the changes.
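The "diff-and-apply" idea can be sketched in Python by treating each tree as a dictionary that maps paths to file contents. Everything here is a simplifying assumption made for illustration: the helper names tree_delta and apply_delta are hypothetical, a modified file simply takes the right-hand content (the real command merges text line by line), and conflicts are not modelled at all.

```python
def tree_delta(left, right):
    """Compare two trees (dicts of path -> content) and describe the changes."""
    delta = {}
    for path in left.keys() | right.keys():
        if path not in right:
            delta[path] = ("delete", None)
        elif path not in left:
            delta[path] = ("add", right[path])
        elif left[path] != right[path]:
            delta[path] = ("modify", right[path])
    return delta


def apply_delta(working_copy, delta):
    """Apply the delta to a copy of the target tree as local modifications."""
    result = dict(working_copy)
    for path, (action, content) in sorted(delta.items()):
        if action == "delete":
            result.pop(path, None)
        else:
            result[path] = content
    return result


left = {"Makefile": "v1", "src/main.c": "a"}
right = {"Makefile": "v2", "src/main.c": "a", "README": "new"}
target = {"Makefile": "v1", "src/main.c": "a", "notes.txt": "local"}

merged = apply_delta(target, tree_delta(left, right))
```

The target keeps its own notes.txt, picks up the new README, and receives the Makefile change. The original target dictionary is left untouched, much as svn revert can restore a working copy after an unwanted merge.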

The syntax of svn merge allows you to specify the three necessary arguments rather flexibly. Here are some examples:

$ svn merge http://svn.example.com/repos/branch1@150 \
            http://svn.example.com/repos/branch2@212 \
            my-working-copy

$ svn merge -r 100:200 http://svn.example.com/repos/trunk my-working-copy

$ svn merge -r 100:200 http://svn.example.com/repos/trunk

The first syntax lays out all three arguments explicitly, naming each tree in the form URL@REV and naming the working copy target. This type of merge is referred to (for obvious reasons) as a “2-URL” merge. The second syntax is used as a shorthand for situations when you're comparing two different revisions of the same URL. The last syntax shows how the working copy argument is optional; if omitted, it defaults to the current directory.
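To make the relationship between the shorthand and explicit forms concrete, here is a small Python sketch that expands a revision range, or a single -c change, into the pair of URL@REV trees being compared. The function names are hypothetical. The sketch relies on the documented equivalence that -c N means -r N-1:N, and that -c -N reverses it.

```python
def expand_range(url, start, end):
    """-r START:END URL compares URL@START (left) with URL@END (right)."""
    return (f"{url}@{start}", f"{url}@{end}")


def expand_change(url, rev):
    """-c REV is shorthand for -r REV-1:REV; a negative REV reverses the range."""
    if rev < 0:
        return expand_range(url, -rev, -rev - 1)
    return expand_range(url, rev - 1, rev)


print(expand_range("http://svn.example.com/repos/trunk", 100, 200))
print(expand_change("^/calc/trunk", 413))
print(expand_change("^/calc/trunk", -5))
```

In each result the first element is the left side of the comparison and the second is the right side. The working copy target is a separate, third argument.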

While the first example shows the “full” syntax of svn merge, use it very carefully; it can result in merges which do not record any svn:mergeinfo metadata at all. The next section talks a bit more about this.

Merges Without Mergeinfo

Subversion tries to generate merge metadata whenever it can, to make future invocations of svn merge smarter. There are still situations, however, where svn:mergeinfo data is not created or changed. Remember to be a bit wary of these scenarios:

Merging unrelated sources: If you ask svn merge to compare two URLs that aren't related to each other, a patch is still generated and applied to your working copy, but no merging metadata is created. There's no common history between the two sources, and future “smart” merges depend on that common history.
Merging from foreign repositories: While it's possible to run a command such as svn merge -r 100:200 http://svn.foreignproject.com/repos/trunk, the resultant patch also lacks any historical merge metadata. At the time of this writing, Subversion has no way of representing different repository URLs within the svn:mergeinfo property.
Using --ignore-ancestry: If this option is passed to svn merge, it causes the merging logic to mindlessly generate differences the same way that svn diff does, ignoring any historical relationships. We discuss this later in this chapter in the section called “Noticing or Ignoring Ancestry”.
Applying reverse merges from a target's natural history: Earlier in this chapter (the section called “Undoing Changes”) we discussed how to use svn merge to apply a “reverse patch” as a way of rolling back changes. If this technique is used to undo a change to an object's personal history (e.g., commit r5 to the trunk, then immediately roll back r5 using svn merge . -c -5), this sort of merge doesn't affect the recorded mergeinfo.

Natural History and Implicit Mergeinfo

As we mentioned earlier when discussing Mergeinfo Inheritance, a path that has the svn:mergeinfo property set on it is said to have “explicit” mergeinfo. Yes, this implies a path can have “implicit” mergeinfo, too! Implicit mergeinfo, or natural history, is simply a path's own history (see the section called “Examining History”) interpreted as mergeinfo. While implicit mergeinfo is largely an implementation detail, it can be a useful abstraction for understanding merge tracking behavior.

Let's say you created ^/trunk in revision 100 and then later, in revision 201, created ^/branches/feature-branch as a copy of ^/trunk@200. The natural history of ^/branches/feature-branch contains all the repository paths and revision ranges through which the history of the new branch has ever passed:

/trunk:100-200
/branches/feature-branch:201

With each new revision added to the repository, the natural history—and thus, implicit mergeinfo—of the branch continues to expand to include those revisions until the day the branch is deleted. Here's what the implicit mergeinfo of our branch would look like when the HEAD revision of the repository had grown to 234:

/trunk:100-200
/branches/feature-branch:201-234

Implicit mergeinfo does not actually show up in the svn:mergeinfo property, but Subversion acts as if it does. This is why if you check out ^/branches/feature-branch and then run svn merge ^/trunk -c 158 in the resulting working copy, nothing happens. Subversion knows that the changes committed to ^/trunk in revision 158 are already present in the target's natural history, so there's no need to try to merge them again. After all, avoiding repeated merges of changes is the primary goal of Subversion's merge tracking feature!
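The natural history of the example branch can be modelled in Python as a mapping from path to an inclusive revision range. This is a sketch for building intuition only: the function names are hypothetical, and it handles just the single-copy scenario described above (trunk created in r100, branch copied from ^/trunk@200 in r201).

```python
def implicit_mergeinfo(source, source_created, copied_from, branch, branch_created, head):
    """Natural history of a branch that was copied once from source@copied_from."""
    return {
        source: (source_created, copied_from),
        branch: (branch_created, head),
    }


def format_mergeinfo(info):
    """Render the ranges in the same PATH:REVS style used by svn:mergeinfo."""
    lines = []
    for path, (first, last) in info.items():
        revs = str(first) if first == last else f"{first}-{last}"
        lines.append(f"{path}:{revs}")
    return lines


def in_natural_history(info, path, rev):
    """True if rev of path is already part of the branch's own history."""
    if path not in info:
        return False
    first, last = info[path]
    return first <= rev <= last


info = implicit_mergeinfo("/trunk", 100, 200, "/branches/feature-branch", 201, 234)
print(format_mergeinfo(info))
print(in_natural_history(info, "/trunk", 150))  # inside 100-200
print(in_natural_history(info, "/trunk", 210))  # committed after the copy
```

A trunk revision inside 100-200, such as r150, is already covered, so merging it would be a no-op. A trunk revision made after the copy, such as r210, is not covered and remains a candidate for merging.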

More on Merge Conflicts

Just like the svn update command, svn merge applies changes to your working copy, and it is therefore also capable of creating conflicts. The conflicts produced by svn merge, however, are sometimes different, and this section explains those differences.

To begin with, assume that your working copy has no local edits. When you svn update to a particular revision, the changes sent by the server always apply “cleanly” to your working copy. The server produces the delta by comparing two trees: a virtual snapshot of your working copy, and the revision tree you're interested in. Because the left hand side of the comparison is exactly equal to what you already have, the delta is guaranteed to correctly convert your working copy into the right hand tree.

But svn merge has no such guarantees and can be much more chaotic: the advanced user can ask the server to compare any two trees at all, even ones that are unrelated to the working copy! This means there's large potential for human error. Users will sometimes compare the wrong two trees, creating a delta that doesn't apply cleanly. The svn merge subcommand does its best to apply as much of the delta as possible, but some parts may be impossible. A common sign that you merged the wrong delta is unexpected tree conflicts:

$ svn merge ^/calc/trunk -r104:115
--- Merging r105 through r115 into '.':
   C doc
   C src/button.c
   C src/integer.c
   C src/real.c
   C src/main.c
--- Recording mergeinfo for merge of r105 through r115 into '.':
 U   .
Summary of conflicts:
  Tree conflicts: 5

$ svn st
 M      .
!     C doc
      >   local dir missing, incoming dir edit upon merge
!     C src/button.c
      >   local file missing, incoming file edit upon merge
!     C src/integer.c
      >   local file missing, incoming file edit upon merge
!     C src/main.c
      >   local file missing, incoming file edit upon merge
!     C src/real.c
      >   local file missing, incoming file edit upon merge
Summary of conflicts:
  Tree conflicts: 5

In the previous example, it might be the case that doc and the four *.c files all exist in both snapshots of the branch being compared. The resultant delta wants to change the contents of the corresponding paths in your working copy, but those paths don't exist in the working copy. Whatever the case, the preponderance of tree conflicts most likely means that you compared the wrong two trees or that you are merging to the wrong working copy target; both are classic signs of user error. When this happens, it's easy to recursively revert all the changes created by the merge (svn revert . --recursive), delete any unversioned files or directories left behind after the revert, and rerun svn merge with the correct arguments.

Also keep in mind that a merge into a working copy with no local edits can still produce text conflicts.

$ svn st

$ svn merge ^/paint/trunk -r289:291
--- Merging r290 through r291 into '.':
C    Makefile
--- Recording mergeinfo for merge of r290 through r291 into '.':
 U   .
Summary of conflicts:
  Text conflicts: 1
Conflict discovered in file 'Makefile'.
Select: (p) postpone, (df) diff-full, (e) edit, (m) merge,
        (mc) mine-conflict, (tc) theirs-conflict, (s) show all options: p

$ svn st
 M      .
C       Makefile
?       Makefile.merge-left.r289
?       Makefile.merge-right.r291
?       Makefile.working
Summary of conflicts:
  Text conflicts: 1

How can a conflict possibly happen? Again, because the user can request svn merge to define and apply any old delta to the working copy, that delta may contain textual changes that don't cleanly apply to a working file, even if the file has no local modifications.

Another small difference between svn update and svn merge is the names of the full-text files created when a conflict happens. In the section called “Resolve Any Conflicts”, we saw that an update produces files named filename.mine, filename.rOLDREV, and filename.rNEWREV. When svn merge produces a conflict, though, it creates three files named filename.working, filename.merge-left.rOLDREV, and filename.merge-right.rNEWREV. In this case, the terms “merge-left” and “merge-right” are describing which side of the double-tree comparison the file came from, “rOLDREV” describes the revision of the left side, and “rNEWREV” the revision of the right side. In any case, these differing names help you distinguish between conflicts that happened as a result of an update and ones that happened as a result of a merge.
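The two naming schemes can be summarized in a short Python helper. The function conflict_files is hypothetical and only reproduces the naming patterns described above. It does not inspect a working copy.

```python
def conflict_files(filename, operation, old_rev, new_rev):
    """Return the names of the full-text files left behind by a text conflict."""
    if operation == "update":
        return [
            f"{filename}.mine",
            f"{filename}.r{old_rev}",
            f"{filename}.r{new_rev}",
        ]
    if operation == "merge":
        return [
            f"{filename}.working",
            f"{filename}.merge-left.r{old_rev}",
            f"{filename}.merge-right.r{new_rev}",
        ]
    raise ValueError(f"unknown operation: {operation!r}")


print(conflict_files("Makefile", "merge", 289, 291))
```

For the Makefile conflict shown earlier, this yields Makefile.working, Makefile.merge-left.r289, and Makefile.merge-right.r291, matching the unversioned files reported by svn st.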

Blocking Changes

Sometimes there's a particular changeset that you don't want automatically merged. For example, perhaps your team's policy is to do new development work on /trunk, but is more conservative about backporting changes to a stable branch you use for releasing to the public. On one extreme, you can manually cherrypick single changesets from the trunk to the branch—just the changes that are stable enough to pass muster. Maybe things aren't quite that strict, though; perhaps most of the time you just let svn merge automatically merge most changes from trunk to branch. In this case, you want a way to mask a few specific changes out, that is, prevent them from ever being automatically merged.

To block a changeset you must make Subversion believe that the change has already been merged. To do this, invoke the merge subcommand with the --record-only option. The option makes Subversion record mergeinfo as if it had actually performed the merge, but no difference is actually applied:

$ cd my-calc-branch

$ svn merge ^/calc/trunk -r386:388 --record-only
--- Recording mergeinfo for merge of r387 through r388 into '.':
 U   .

# Only the mergeinfo is changed
$ svn st
 M      .

$ svn pg svn:mergeinfo -vR
Properties on '.':
  svn:mergeinfo
    /calc/trunk:341-378,387-388

$ svn commit -m "Block r387-388 from being merged to my-calc-branch."
Sending        .

Committed revision 461.
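The bookkeeping effect of --record-only can be pictured as adding a revision range to the list already recorded for a merge source, without touching any file contents. The Python sketch below is a deliberately simplified model with hypothetical function names. Real mergeinfo also carries details, such as non-inheritable ranges, that are ignored here.

```python
def add_range(ranges, new):
    """Add an inclusive (start, end) range and coalesce overlapping or
    adjacent ranges, as a merge-tracking record would."""
    merged = []
    for start, end in sorted(ranges + [new]):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def format_ranges(ranges):
    """Render ranges in the comma-separated style used by svn:mergeinfo."""
    return ",".join(str(s) if s == e else f"{s}-{e}" for s, e in ranges)


recorded = [(341, 378)]                  # /calc/trunk before blocking
blocked = add_range(recorded, (387, 388))  # -r386:388 covers r387-388
print("/calc/trunk:" + format_ranges(blocked))
```

Because r379 through r386 are still missing from the record, a later sync merge would still consider them, while r387 and r388 now look as though they have already been merged.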

Since Subversion 1.7, --record-only merges are transitive. This means that, in addition to recording mergeinfo describing the blocked revision(s), any svn:mergeinfo property differences in the merge source are also applied. For example, let's say we want to block the 'paint-python-wrapper' feature from ever being merged from ^/paint/trunk to the ^/paint/branches/paint-1.0.x branch. We know the work on this feature was done on its own branch, which was reintegrated to /paint/trunk in revision 465:

$ svn log -v -r465 ^/paint/trunk
------------------------------------------------------------------------
r465 | joe | 2013-02-25 14:05:12 -0500 (Mon, 25 Feb 2013) | 1 line
Changed paths:
   M /paint/trunk
   A /paint/trunk/python (from /paint/branches/paint-python-wrapper/python:464)

Reintegrate Paint Python wrapper.
------------------------------------------------------------------------

Because revision 465 was a reintegrate merge we know that mergeinfo was recorded describing the merge:

$ svn diff ^/paint/trunk --depth empty -c465
Index: .
===================================================================
--- .   (revision 464)
+++ .   (revision 465)

Property changes on: .
___________________________________________________________________
Added: svn:mergeinfo
   Merged /paint/branches/paint-python-wrapper:r463-464

Now simply blocking merges of revision 465 from /paint/trunk isn't foolproof since someone could merge r462:464 directly from /paint/branches/paint-python-wrapper. Fortunately the transitive nature of --record-only merges prevents this; the --record-only merge applies the svn:mergeinfo diff from revision 465, thus blocking merges of that change directly from /paint/trunk and indirectly from /paint/branches/paint-python-wrapper:

$ cd paint/branches/paint-1.0.x

$ svn merge ^/paint/trunk --record-only -c465
--- Merging r465 into '.':
 U   .
--- Recording mergeinfo for merge of r465 into '.':
 G   .

$ svn diff --depth empty
Index: .
===================================================================
--- .   (revision 462)
+++ .   (working copy)

Property changes on: .
___________________________________________________________________
Added: svn:mergeinfo
   Merged /paint/branches/paint-python-wrapper:r463-464
   Merged /paint/trunk:r465

$ svn ci -m "Block the Python wrappers from the first release of paint."
Sending        .

Committed revision 466.

Now any subsequent attempts to merge the feature to the /paint/branches/paint-1.0.x branch are inoperative:

$ svn merge ^/paint/trunk -c465
--- Recording mergeinfo for merge of r465 into '.':
 U   .

$ svn st # No change!

$ svn merge ^/paint/branches/paint-python-wrapper -r462:464
--- Recording mergeinfo for merge of r463 through r464 into '.':
 U   .

$ svn st  # No change!

$

If at a later time you realize that you actually do need the blocked feature merged to the /paint/branches/paint-1.0.x branch, you have a couple of choices. You can reverse merge r466 (the revision in which you blocked the feature), as we discussed in the section called “Undoing Changes”. Once you commit that change you can repeat the merge of r465 from /paint/trunk. Alternatively, you can simply repeat the merge of r465 from /paint/trunk using the --ignore-ancestry option, which will cause the merge to disregard any mergeinfo and simply apply the requested diff; see the section called “Noticing or Ignoring Ancestry”.

$ svn merge ^/paint/trunk -c465 --ignore-ancestry
--- Merging r465 into '.':
A    python
A    python/paint.py
 G   .

Blocking changes with --record-only works, but it's also a little bit dangerous. The main problem is that we're not clearly differentiating between the ideas of “I already have this change” and “I don't have this change, but don't currently want it.” We're effectively lying to the system, making it think that the change was previously merged. This puts the responsibility on you—the user—to remember that the change wasn't actually merged, it just wasn't wanted. There's no way to ask Subversion for a list of “blocked changelists.” If you want to track them (so that you can unblock them someday) you'll need to record them in a text file somewhere, or perhaps in an invented property.

Merge-Sensitive Logs and Annotations

One of the main features of any version control system is to keep track of who changed what, and when they did it. The svn log and svn blame subcommands are just the tools for this: when invoked on individual files, they show not only the history of changesets that affected the file, but also exactly which user wrote which line of code, and when she did it.

When changes start getting replicated between branches, however, things start to get complicated. For example, if you were to ask svn log about the history of your feature branch, it would show exactly every revision that ever affected the branch:

$ cd my-calc-branch

$ svn log -q
------------------------------------------------------------------------
r461 | user | 2013-02-25 05:57:48 -0500 (Mon, 25 Feb 2013)
------------------------------------------------------------------------
r379 | user | 2013-02-18 10:56:35 -0500 (Mon, 18 Feb 2013)
------------------------------------------------------------------------
r378 | user | 2013-02-18 09:48:28 -0500 (Mon, 18 Feb 2013)
------------------------------------------------------------------------
…
------------------------------------------------------------------------
r8 | sally | 2013-01-17 16:55:36 -0500 (Thu, 17 Jan 2013)
------------------------------------------------------------------------
r7 | bill | 2013-01-17 16:49:36 -0500 (Thu, 17 Jan 2013)
------------------------------------------------------------------------
r3 | bill | 2013-01-17 09:07:04 -0500 (Thu, 17 Jan 2013)
------------------------------------------------------------------------

But is this really an accurate picture of all the changes that happened on the branch? What's left out here is the fact that revisions 352, 362, 372 and 379 were actually the results of merging changes from the trunk. If you look at one of these logs in detail, the multiple trunk changesets that comprised the branch change are nowhere to be seen:

$ svn log ^/calc/branches/my-calc-branch -r352 -v
------------------------------------------------------------------------
r352 | user | 2013-02-16 09:35:18 -0500 (Sat, 16 Feb 2013) | 1 line
Changed paths:
   M /calc/branches/my-calc-branch
   M /calc/branches/my-calc-branch/Makefile
   M /calc/branches/my-calc-branch/doc/INSTALL
   M /calc/branches/my-calc-branch/src/button.c
   M /calc/branches/my-calc-branch/src/real.c

Sync latest trunk changes to my-calc-branch.
------------------------------------------------------------------------

We happen to know that this merge to the branch was nothing but a merge of trunk changes. How can we see those trunk changes as well? The answer is to use the --use-merge-history (-g) option. This option expands those “child” changes that were part of the merge.

$ svn log ^/calc/branches/my-calc-branch -r352 -v -g
------------------------------------------------------------------------
r352 | user | 2013-02-16 09:35:18 -0500 (Sat, 16 Feb 2013) | 1 line
Changed paths:
   M /calc/branches/my-calc-branch
   M /calc/branches/my-calc-branch/Makefile
   M /calc/branches/my-calc-branch/doc/INSTALL
   M /calc/branches/my-calc-branch/src/button.c
   M /calc/branches/my-calc-branch/src/real.c

Sync latest trunk changes to my-calc-branch.
------------------------------------------------------------------------
r351 | sally | 2013-02-16 08:04:22 -0500 (Sat, 16 Feb 2013) | 2 lines
Changed paths:
   M /calc/trunk/src/real.c
Merged via: r352

Trunk work on calc project.
------------------------------------------------------------------------
…
------------------------------------------------------------------------
r345 | sally | 2013-02-15 16:51:17 -0500 (Fri, 15 Feb 2013) | 2 lines
Changed paths:
   M /calc/trunk/Makefile
   M /calc/trunk/src/integer.c
Merged via: r352

Trunk work on calc project.
------------------------------------------------------------------------
r344 | sally | 2013-02-15 16:44:44 -0500 (Fri, 15 Feb 2013) | 1 line
Changed paths:
   M /calc/trunk/src/integer.c
Merged via: r352

Refactor the bazzle functions.
------------------------------------------------------------------------

By making the log operation use merge history, we see not just the revision we queried (r352), but also the other revisions that came along on the ride with it—Sally's work on trunk. This is a much more complete picture of history!

The svn blame command also takes the --use-merge-history (-g) option. If this option is neglected, somebody looking at a line-by-line annotation of src/button.c may get the mistaken impression that you were responsible for a particular change:

$ svn blame src/button.c
…
   352    user    retval = inverse_func(button, path);
   352    user    return retval;
   352    user    }
…

And while it's true that you did actually commit those three lines in revision 352, two of them were actually written by Sally back in revision 348 and were brought into your branch via a sync merge:

$ svn blame src/button.c -g
…
G    348    sally   retval = inverse_func(button, path);
G    348    sally   return retval;
     352    user    }
…

Now we know who to really blame for those two lines of code!

Noticing or Ignoring Ancestry

When conversing with a Subversion developer, you might very likely hear reference to the term ancestry. This word is used to describe the relationship between two objects in a repository: if they're related to each other, one object is said to be an ancestor of the other.

For example, suppose you commit revision 100, which includes a change to a file foo.c. Then foo.c@99 is an “ancestor” of foo.c@100. On the other hand, suppose you commit the deletion of foo.c in revision 101, and then add a new file by the same name in revision 102. In this case, foo.c@99 and foo.c@102 may appear to be related (they have the same path), but in fact are completely different objects in the repository. They share no history or “ancestry.”

The reason for bringing this up is to point out an important difference between svn diff and svn merge. The former command ignores ancestry, while the latter command is quite sensitive to it. For example, if you asked svn diff to compare revisions 99 and 102 of foo.c, you would see line-based diffs; the diff command is blindly comparing two paths. But if you asked svn merge to compare the same two objects, it would notice that they're unrelated and first attempt to delete the old file, then add the new file; the output would indicate a deletion followed by an add:

D    foo.c
A    foo.c

Most merges involve comparing trees that are ancestrally related to one another; therefore, svn merge defaults to this behavior. Occasionally, however, you may want the merge command to compare two unrelated trees. For example, you may have imported two source-code trees representing different vendor releases of a software project (see the section called “Vendor Branches”). If you ask svn merge to compare the two trees, you'd see the entire first tree being deleted, followed by an add of the entire second tree! In these situations, you'll want svn merge to do a path-based comparison only, ignoring any relations between files and directories. Add the --ignore-ancestry option to your merge command, and it will behave just like svn diff. (And conversely, the --notice-ancestry option will cause svn diff to behave like the svn merge command.)

	Tip
	The --ignore-ancestry option also disables Merge Tracking. This means that svn:mergeinfo is not considered when svn merge is determining what revisions to merge, nor is svn:mergeinfo recorded to describe the merge.
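The difference between a path-based comparison and an ancestry-aware one can be sketched in Python. This is a toy model under stated assumptions: the FileVersion type and its line_of_history field are hypothetical stand-ins for the repository's notion of object identity, and the functions only mimic the status letters that the commands would report.

```python
from collections import namedtuple

# line_of_history is a made-up identifier: two versions share it only if
# one is an ancestor of the other.
FileVersion = namedtuple("FileVersion", ["path", "rev", "line_of_history", "text"])


def diff_view(old, new):
    """Path-based comparison: only the contents matter."""
    return [("M", new.path)] if old.text != new.text else []


def merge_view(old, new, ignore_ancestry=False):
    """Ancestry-aware comparison: unrelated objects become delete plus add."""
    if not ignore_ancestry and old.line_of_history != new.line_of_history:
        return [("D", old.path), ("A", new.path)]
    return diff_view(old, new)


foo_99 = FileVersion("foo.c", 99, line_of_history=1, text="int a;")
foo_102 = FileVersion("foo.c", 102, line_of_history=2, text="int b;")  # re-added file

print(diff_view(foo_99, foo_102))
print(merge_view(foo_99, foo_102))
print(merge_view(foo_99, foo_102, ignore_ancestry=True))
```

With ancestry noticed, the unrelated foo.c versions produce a deletion followed by an add. With ignore_ancestry=True, the result collapses to the same content-only modification that the diff view reports.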

Merges and Moves

A common desire is to refactor source code, especially in Java-based software projects. Files and directories are shuffled around and renamed, often causing great disruption to everyone working on the project. Sounds like a perfect case to use a branch, doesn't it? Just create a branch, shuffle things around, and then merge the branch back to the trunk, right?

Alas, this scenario doesn't work so well right now and is considered one of Subversion's current weak spots. The problem is that Subversion's svn merge command isn't as robust as it should be, particularly when dealing with copy and move operations.

When you use svn copy to duplicate a file, the repository remembers where the new file came from, but it fails to transmit that information to the client that is running svn update or svn merge. Instead of telling the client, “Copy that file you already have to this new location,” it sends down an entirely new file. This can lead to problems, particularly tree conflicts in the case of renames, which involve not only the new copy but also a deletion of the old path. A lesser-known fact about Subversion is that it lacks “true renames”: the svn move command is nothing more than an aggregation of svn copy and svn delete.

For example, suppose that you want to make some changes on your private branch /calc/branches/my-calc-branch. First you perform an automatic sync merge with /calc/trunk and commit that in r470:

$ cd calc/branches/my-calc-branch

$ svn merge ^/calc/trunk
--- Merging differences between repository URLs into '.':
U    doc/INSTALL
A    FAQ
U    src/main.c
U    src/button.c
U    src/integer.c
U    Makefile
U    README
 U   .
--- Recording mergeinfo for merge between repository URLs into '.':
 U   .

$ svn ci -m "Sync all changes from ^/calc/trunk through r469."
Sending        .
Sending        Makefile
Sending        README
Sending        FAQ
Sending        doc/INSTALL
Sending        src/main.c
Sending        src/button.c
Sending        src/integer.c
Transmitting file data ....
Committed revision 470.

Then you rename integer.c to whole.c in r471 and then make some edits to the same file in r473. Effectively you've created a new file in your branch (that is a copy of the original file plus some edits) and deleted the original file. Meanwhile, back on /calc/trunk, Sally has committed some improvements of her own to integer.c in r472:

$ svn log -v -r472 ^/calc/trunk
------------------------------------------------------------------------
r472 | sally | 2013-02-26 07:05:18 -0500 (Tue, 26 Feb 2013) | 1 line
Changed paths:
   M /calc/trunk/src/integer.c

Trunk work on integer.c.
------------------------------------------------------------------------

Now you decide to merge your branch back to the trunk, so you run the merge from a trunk working copy. How will Subversion combine the rename and edits you made with Sally's edits?

$ svn merge ^/calc/branches/my-calc-branch
--- Merging differences between repository URLs into '.':
   C src/integer.c
 U   src/real.c
A    src/whole.c
--- Recording mergeinfo for merge between repository URLs into '.':
 U   .
Summary of conflicts:
  Tree conflicts: 1

$ svn st
 M      .
      C src/integer.c
      >   local file edit, incoming file delete upon merge
 M      src/real.c
A  +    src/whole.c
Summary of conflicts:
  Tree conflicts: 1

The answer is that Subversion won't combine those changes. Instead, it raises a tree conflict because it needs your help to figure out which of your changes and which of Sally's changes should ultimately end up in whole.c, or even whether the rename should take place at all!

You will need to resolve this tree conflict before committing the merge, and this may require some manual intervention on your part; see the section called “Dealing with Structural Conflicts”. The moral of this story is that until Subversion improves, be careful about merging copies and renames from one branch to another, and when you do, be prepared for some manual resolution.
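
Why the rename ends in a conflict can be sketched with a toy model. It assumes only what the text says, that a rename reaches the merge as an add plus a delete; the function names and the edited_on_target set are hypothetical, and the real merge logic is far more involved.

```python
from typing import Iterable, List, Set, Tuple

Change = Tuple[str, str]  # (action, path); action is "add" or "delete"


def rename_as_changes(old_path: str, new_path: str) -> List[Change]:
    """A rename seen as the text describes it: a copy plus a delete."""
    return [("add", new_path), ("delete", old_path)]


def merge_into_target(incoming: Iterable[Change],
                      edited_on_target: Set[str]) -> List[str]:
    """Return status lines loosely modelled on the svn merge output above."""
    report = []
    for action, path in incoming:
        if action == "delete" and path in edited_on_target:
            # The target has edits the source never saw, so deleting the
            # path would silently discard them: flag a tree conflict.
            report.append(f"   C {path}")
        elif action == "delete":
            report.append(f"D    {path}")
        else:
            report.append(f"A    {path}")
    return report


incoming = rename_as_changes("src/integer.c", "src/whole.c")
for line in merge_into_target(incoming, edited_on_target={"src/integer.c"}):
    print(line)
```

Because the incoming delete carries no link to the added whole.c, the model has nowhere to send Sally's edits to integer.c. That is the gap a human has to fill when resolving the conflict.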

Preventing Naïve Clients from Committing Merges

If you've just upgraded your server to Subversion 1.5 or later, there's a risk that pre-1.5 Subversion clients can cause problems with Merge Tracking. This is because pre-1.5 clients don't support this feature; when one of these older clients performs svn merge, it doesn't modify the value of the svn:mergeinfo property at all. So the subsequent commit, despite being the result of a merge, doesn't tell the repository about the duplicated changes—that information is lost. Later on, when “merge-aware” clients attempt automatic merging, they're likely to run into all sorts of conflicts resulting from repeated merges.

If you and your team are relying on the merge-tracking features of Subversion, you may want to configure your repository to prevent older clients from committing changes. The easy way to do this is by inspecting the “capabilities” parameter in the start-commit hook script. If the client reports itself as having mergeinfo capabilities, the hook script can allow the commit to start. If the client doesn't report that capability, have the hook deny the commit. Example 4.1, “Merge-tracking gatekeeper start-commit hook script” gives an example of such a hook script:

Example 4.1. Merge-tracking gatekeeper start-commit hook script

#!/usr/bin/env python
import sys

# The start-commit hook is invoked immediately after a Subversion txn is
# created and populated with initial revprops in the process of doing a
# commit. Subversion runs this hook by invoking a program (script, 
# executable, binary, etc.) named 'start-commit' (for which this file
# is a template) with the following ordered arguments:
#
#   [1] REPOS-PATH   (the path to this repository)
#   [2] USER         (the authenticated user attempting to commit)
#   [3] CAPABILITIES (a colon-separated list of capabilities reported
#                     by the client; see note below)
#   [4] TXN-NAME     (the name of the commit txn just created)

capabilities = sys.argv[3].split(':')
if "mergeinfo" not in capabilities:
  sys.stderr.write("Commits from merge-tracking-unaware clients are "
                   "not permitted.  Please upgrade to Subversion 1.5 "
                   "or newer.\n")
  sys.exit(1)
sys.exit(0)

For more information about hook scripts, see the section called “Implementing Repository Hooks”.
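
When adapting Example 4.1, it can help to isolate the capability check in a function that can be exercised without a repository. The following is a minimal sketch under that assumption; the function names supports_merge_tracking and start_commit_status are hypothetical, and only the "mergeinfo" capability string and the argument order come from the example itself.

```python
from typing import Sequence


def supports_merge_tracking(capabilities_arg: str) -> bool:
    """Return True if the colon-separated list names the mergeinfo capability."""
    # Split first, then test membership, so that a longer capability name
    # that merely contains "mergeinfo" does not count as a match.
    return "mergeinfo" in capabilities_arg.split(":")


def start_commit_status(argv: Sequence[str]) -> int:
    """Return the exit status the hook should use for this argument vector."""
    # argv[3] is CAPABILITIES, per the argument list in Example 4.1.
    # A missing argument is treated the same as an empty capability list.
    if len(argv) > 3 and supports_merge_tracking(argv[3]):
        return 0
    return 1


# Hypothetical invocations; the repository path, user names, and
# transaction name are placeholders.
print(start_commit_status(["start-commit", "/var/svn/calc", "sally", "mergeinfo", "txn-1"]))
print(start_commit_status(["start-commit", "/var/svn/calc", "harry", "", "txn-2"]))
```

A real hook would still need to write the explanatory message to stderr and pass the returned status to sys.exit, as the example does.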

The Final Word on Merge Tracking

The bottom line is that Subversion's merge-tracking feature has a complex internal implementation, and the svn:mergeinfo property is the only window the user has into the machinery.

How and when mergeinfo is recorded by a merge can sometimes be difficult to understand. Furthermore, the management of mergeinfo metadata has a whole set of taxonomies and behaviors around it, such as “explicit” versus “implicit” mergeinfo, “operative” versus “inoperative” revisions, specific mechanisms of mergeinfo “elision,” and even “inheritance” from parent to child directories.

We've chosen to only briefly cover, if at all, these detailed topics for a couple of reasons. First, the level of detail is overwhelming for a typical user. Second, and more importantly, the typical user doesn't need to understand these concepts; typically they remain in the background as implementation details. All that said, if you enjoy this sort of thing, you can get a fantastic overview in a paper posted at CollabNet's website: http://www.open.collab.net/community/subversion/articles/merge-info.html.

For now, if you want to steer clear of the complexities of merge tracking, we recommend that you follow these simple best practices:

- For short-term feature branches, follow the simple procedure described throughout the section called “Basic Merging”.
- Avoid subtree merges and subtree mergeinfo. Perform merges only on the root of your branches, not on subdirectories or files (see the section called “Subtree Merges and Subtree Mergeinfo”).
- Don't ever edit the svn:mergeinfo property directly; use svn merge with the --record-only option to effect a desired change to the metadata (as demonstrated in the section called “Blocking Changes”).
- Your merge target should be a working copy which represents the root of a complete tree representing a single location in the repository at a single point in time:
  - Update before you merge! Don't use the --allow-mixed-revisions option to merge into mixed-revision working copies.
  - Don't merge to targets with “switched” subdirectories (as described next in the section called “Traversing Branches”).
  - Avoid merges to targets with sparse directories. Likewise, don't merge to depths other than --depth=infinity.
  - Be sure you have read access to all of the merge source and read/write access to all of the merge target.
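
Several of the working-copy conditions above can be checked before a merge. One possible approach, sketched here, is to inspect the one-line summary printed by the svnversion program that ships with Subversion: a colon-separated range indicates mixed revisions, a trailing S indicates switched content, and a trailing P indicates a partial (sparse) working copy. The helper name merge_target_warnings and the wording of the warnings are made up for illustration.

```python
import re
from typing import List

# Matches strings such as "4168", "4123:4168", "4168M", or "4123:4168MS".
_SVNVERSION = re.compile(r"^(\d+)(?::(\d+))?([MSP]*)$")


def merge_target_warnings(svnversion_output: str) -> List[str]:
    """Return reasons why a working copy is a risky merge target."""
    match = _SVNVERSION.match(svnversion_output.strip())
    if match is None:
        return ["not a recognizable working copy version string"]
    _low, high, flags = match.groups()
    warnings = []
    if high is not None:
        warnings.append("mixed revisions: run svn update first")
    if "S" in flags:
        warnings.append("switched: part of the tree points elsewhere")
    if "P" in flags:
        warnings.append("sparse: some directories are not at full depth")
    # A trailing M (local modifications) is deliberately not reported,
    # since the practices above do not mention it.
    return warnings


for sample in ["4168", "4123:4168", "4123:4168MS", "4168P"]:
    print(sample, merge_target_warnings(sample))
```

An empty list means none of the three conditions was detected. The check says nothing about access rights or subtree mergeinfo, so it complements the list above and does not replace it.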

Of course, sometimes you may need to violate some of these best practices. Don't worry if you need to; just be sure you understand the ramifications of doing so.
