Merge two SVN repositories

At some point I created a copy of a project and committed it into a second repository. That in itself is not a big problem, but merging the two repositories back together while keeping the full history of both is a challenge.

The challenge

Subversion does not support combining two repositories directly. This is because of the way Subversion stores revisions. When you have two repositories to combine, it is important to understand that the revisions of the same directory in both repositories cannot be merged into each other. What you can do is merge the two repositories into one by loading them into two different directories of a single repository. Let's assume we have the following source repositories, where repository A came first and the project was later moved to repository B.

repo_A
|-- branch
|-- tags
\-- trunk
    |-- file1.txt
    \-- file2.txt
repo_B
|-- branch
|-- tags
|   |-- tag_1
|   \-- tag_2
\-- trunk
    |-- file1.txt
    \-- file2.txt

Notice that both repositories contain the same files in the trunk. If we want to combine these repositories, we cannot merge the two trunk directories into one, but we can bring both repositories and their history into one new repository. The resulting repository might look like the following:

repo_combo
|-- branch
|   \-- repo_A_trunk          (created from trunk of repository A)
|       |-- file1.txt
|       \-- file2.txt
|-- tags
|   |-- tag_1                 (the tag from repository B)
|   \-- tag_2                 (the tag from repository B)
\-- trunk                     (the trunk from repository B)
    |-- file1.txt
    \-- file2.txt

Subversion does not enforce a fixed repository layout, so you can place the “trunk” from repo_A wherever you want. The layout above is just the one I used for this merge.

This is how it works

The following steps will explain the procedure to merge the two repositories. As you will see, this procedure will dump both repositories and merge them into a completely new repository.

Warning: It would be possible to load one repository directly into the other, but for safety reasons I decided not to do that. With this procedure you can always go back to the two unchanged repositories in case something goes wrong or you forgot to merge something.

To start merging the repositories, we need a dump from each of them. This is done with the following commands:

svnadmin dump /path/to/repo_A >repo_A.dump
svnadmin dump /path/to/repo_B >repo_B.dump
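
Before going further, it can be worth checking that both dumps were actually written. The following is only a minimal sketch: the function name is_svn_dump is made up for this example, and it relies on the fact that a Subversion dump file starts with an "SVN-fs-dump-format-version" header line. It does not validate the rest of the file.

```shell
# Hypothetical helper (not part of Subversion): succeeds only if the file
# starts with the header line that svnadmin dump writes first.
is_svn_dump() {
    # $1: path to the dump file to inspect
    head -n 1 "$1" | grep -q '^SVN-fs-dump-format-version: '
}
```

It could be used like this: is_svn_dump repo_A.dump && echo "repo_A.dump looks like a dump file"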

These dump files contain all the commits of the whole repository. Each one is effectively a full copy of the complete repository in a single file. Earlier I explained the structure we are aiming for: the content of repository A should end up in the “branch” directory of the new repository, but without the “tags” and “branch” directories of repo_A.

So the next step is to filter the unneeded content out of the dump files. From repo_A we only want the content of the “trunk” directory; “tags” and “branch” are not needed. From repo_B we need “trunk” and “tags”, but not the “branch” directory. This is done with the following commands. For repo_A the filter defines only the content that should be included, while for repo_B it defines what should be excluded. The --drop-empty-revs option drops revisions that no longer contain any changes after filtering, and --renumber-revs renumbers the remaining revisions. If you don’t want that, omit the corresponding option.

svndumpfilter include "trunk" --drop-empty-revs --renumber-revs <repo_A.dump >repo_A_trunk.dump
svndumpfilter exclude "branch" --drop-empty-revs --renumber-revs <repo_B.dump >repo_B_trunk_tags.dump

If you need any other directories from the dump as well, you need to adapt the filter accordingly. See the svndumpfilter manpage for details.
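
To get a feeling for what the filter removed, you can compare the number of revision records before and after filtering. This is a rough sketch only: the function name count_revs is made up for this example, and it simply counts lines starting with "Revision-number: ", so a versioned file that happens to contain such a line would be counted as well. The count includes revision 0.

```shell
# Hypothetical helper: print how many revision records a dump file contains.
count_revs() {
    # $1: path to the dump file
    # -a treats the (partly binary) dump as text, -c prints only the count
    grep -a -c '^Revision-number: ' "$1"
}
```

For example, comparing the output of "count_revs repo_A.dump" with that of "count_revs repo_A_trunk.dump" shows how many revisions were dropped by --drop-empty-revs.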

Now we need to build up the new repository structure as described above. To do so we need to create a new repository and check this out locally to build up the structure.

svnadmin create /path/to/repo_combined
svn checkout file:///path/to/repo_combined/ /path/to/checkout_combined

Create the directory structure in the working copy just as you would for any new repository. Once the directories exist, they need to be added and committed to the repository. All this is done with the following commands:

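cd /path/to/checkout_combined
mkdir branch
mkdir branch/repo_A_trunk
# "trunk" and "tags" are not created here: the repo_B dump adds them itself,
# and loading it fails if they already exist in the repository.
svn add *
svn commit -m "commit message for the structure of the new repository"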
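
As an alternative sketch, the parent directory for a dump can also be created directly in the repository, without a working copy, by passing a URL to "svn mkdir". The function name make_parent_dir and the commit message are made up for this example; it assumes the repository is given as an absolute local path.

```shell
# Hypothetical helper: create a directory (including missing parents)
# directly in a local repository with a single commit.
make_parent_dir() {
    # $1: absolute path of the repository, e.g. /path/to/repo_combined
    # $2: directory inside the repository, e.g. branch/repo_A_trunk
    svn mkdir --parents -m "Create $2 as target for a dump" "file://$1/$2"
}
```

A call such as "make_parent_dir /path/to/repo_combined branch/repo_A_trunk" would then replace the checkout, mkdir, add and commit steps for that directory.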

Now that we have the structure created we can load the dump into the new repository. When doing this, the parent directory you load it to needs to already exist. That’s why we needed to create the parent directory (“branch/repo_A_trunk”) before loading the dump.

svnadmin load /path/to/repo_combined --parent-dir branch/repo_A_trunk --ignore-uuid <repo_A_trunk.dump
svnadmin load /path/to/repo_combined --ignore-uuid <repo_B_trunk_tags.dump

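After this, the new repository contains the “trunk” from repository A as well as the “trunk” and “tags” from repository B. The content of repository A is located below /branch/repo_A_trunk, and the “trunk” from repository B is the “trunk” of the new repository. Keep in mind that --parent-dir only prefixes the paths stored in the dump, so the files from repository A keep their “trunk” path component below /branch/repo_A_trunk unless you move them afterwards. By loading the repo_A dump first and the repo_B dump afterwards, we keep the revisions in their chronological order.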

To check this has all worked, just execute “svn update” in the already checked out directory. With “svn log -v” you will then be able to print the complete history.
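
In addition to looking at the log, a quick check of the top-level layout can be scripted. The sketch below is an assumption of how one might do it: the function name missing_paths is made up, and it expects a listing in the format printed by "svn ls" (one entry per line, directories with a trailing slash) on standard input.

```shell
# Hypothetical helper: read a directory listing on stdin and report every
# expected entry (given as arguments) that does not appear in it.
missing_paths() {
    mp_listing=$(cat)
    mp_rc=0
    for mp_want in "$@"; do
        # -x: match the whole line, -F: treat the name as a fixed string
        if ! printf '%s\n' "$mp_listing" | grep -qxF -- "$mp_want"; then
            printf 'missing: %s\n' "$mp_want"
            mp_rc=1
        fi
    done
    return "$mp_rc"
}
```

It could be used like this: svn ls file:///path/to/repo_combined | missing_paths branch/ tags/ trunk/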

Alternative structure

Of course, you can use the same procedure to create a structure like the following, simply by not filtering anything out and loading the two dumps into the directories repo_A and repo_B.

repo_combined
|-- repo_A
|   |-- branch
|   |-- tags
|   \-- trunk
|       |-- file1.txt
|       \-- file2.txt
\-- repo_B
    |-- branch
    |-- tags
    |   |-- tag_1
    |   \-- tag_2
    \-- trunk
        |-- file1.txt
        \-- file2.txt

With this procedure you can create any structure you want, but keep in mind that you cannot load a dump into a directory which does not already exist in the repository, or into one which already contains entries with the same names as those you would be importing.
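
For the alternative structure, the steps could be sketched as follows. The function name load_side_by_side and the commit message are made up for this example. It assumes a freshly created, empty combined repository given as an absolute local path, and the two unfiltered dump files from the beginning of the procedure.

```shell
# Hypothetical helper: load two unfiltered dumps next to each other
# into the directories repo_A and repo_B of a new repository.
load_side_by_side() {
    # $1: absolute path of the new, empty combined repository
    # $2: unfiltered dump of repository A
    # $3: unfiltered dump of repository B
    # Create both parent directories in one commit, directly via URL.
    svn mkdir -m "Create parent directories for both repositories" \
        "file://$1/repo_A" "file://$1/repo_B" || return 1
    # Load A first and B second to keep the chronological order.
    svnadmin load "$1" --parent-dir repo_A --ignore-uuid <"$2" || return 1
    svnadmin load "$1" --parent-dir repo_B --ignore-uuid <"$3"
}
```

It could be called like this: load_side_by_side /path/to/repo_combined repo_A.dump repo_B.dump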


Read more of my posts on my blog at http://blog.tinned-software.net/.
