Using Git subtrees to split a repository

We recently needed to create a new back-end server for an existing application. The current application runs on a MEAN stack (MongoDB, Express, AngularJS, Node.js), but a new client wants the back end deployed on a JBoss server. That left us needing a completely different back end while sharing the front end between the two. The approach we opted for was to use git subtrees to split the UI code into its own repository, which is then shared between the Node.js repo and the Java repo.

To be clear, I would only use this for very specific situations like this.  If possible, keeping things simple in a single repository is usually best.  But if you’re in the same situation, hopefully this will be helpful for you.

Splitting the Original Repository

The subtree commands effectively take a folder and split it off into another repository. Everything you want in the subtree repo needs to be in that one folder. For the sake of this example, let’s assume you have a /lib folder that you want to extract to a separate repo.

Create a new folder and initialize a bare git repo:

mkdir lib-repo
cd lib-repo
git init --bare

Create a remote repository on GitHub (or wherever you host your code) for the lib project and add it as the origin remote of lib-repo.

From within your parent project folder, use the subtree split command to put the history of the lib folder on a separate branch:

git subtree split --prefix=lib -b split

Push the contents of the split branch to your newly created bare repo, using the file path to the repository:

git push ~/lib-repo split:master

This pushes the split branch to your new repo as the master branch.

Then, from lib-repo, push master to your origin remote.
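To see what the split produces without touching a real project, the steps above can be tried in a throwaway sandbox. This is a minimal sketch, assuming git subtree is installed alongside git; the temporary directory, the file names (util.txt, app.txt) and the demo identity are made up for illustration and are not part of the original project.

```shell
#!/bin/sh
# Sandbox sketch: split lib/ out of a throwaway parent repo into a bare repo.
set -e

work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Bare repository that will receive the extracted history.
git init -q --bare lib-repo
git -C lib-repo symbolic-ref HEAD refs/heads/master

# Parent project with a lib/ folder and one file outside it.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master
mkdir lib
echo "shared code" > lib/util.txt
echo "app code" > app.txt
git add -A
git commit -q -m "initial commit"

# Extract the history of lib/ onto a branch named split, then push it.
git subtree split --prefix=lib -b split
git push -q "$work/lib-repo" split:master

# List the files on master in the new repo.
split_files=$(git -C "$work/lib-repo" ls-tree -r --name-only master)
echo "$split_files"
```

The point of the listing at the end is that the files appear at the top level of the new repository, without the lib/ prefix, and that nothing from outside lib/ comes along.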

Now that the lib folder lives in its new repository, you need to remove it from the parent repository and add it back as a subtree from its new repository:

git remote add lib <url_to_lib_remote>
git rm -r lib
git add -A
git commit -am "removing lib folder"
git subtree add --prefix=lib lib master
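The remove-and-re-add step can be rehearsed the same way. The sketch below assumes git subtree is installed and uses a local bare repository (lib-remote, a made-up name) in place of the real remote URL; file names and the demo identity are likewise illustrative.

```shell
#!/bin/sh
# Sandbox sketch: replace a plain lib/ folder with a subtree of the same content.
set -e

work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the hosted lib repository.
git init -q --bare lib-remote
git -C lib-remote symbolic-ref HEAD refs/heads/master

# Parent project whose lib/ history is split and pushed first.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master
mkdir lib
echo "shared code" > lib/util.txt
echo "app code" > app.txt
git add -A
git commit -q -m "initial commit"
git subtree split --prefix=lib -b split
git push -q "$work/lib-remote" split:master

# The steps from the article, with a local path in place of the remote URL.
git remote add lib "$work/lib-remote"
git rm -r -q lib
git commit -q -m "removing lib folder"
git subtree add --prefix=lib lib master

# lib/ is back; the last commit is the one created by subtree add.
ls lib
git log -1 --format=%s
```

Note that git subtree add refuses to run if the lib folder still exists or the working tree has uncommitted changes, which is why the removal is committed first.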

 

Setting up a new user with the subtree

When a new user wants to work on your repository, they will need to set up the subtree repo manually. The split-off folder ends up living in two repositories: the existing repo and the one set up as a subtree, and changes have to be pushed to the subtree explicitly. This is a mixed blessing. If you have a repository with a few occasional committers, they can pull the original repository and push as if the subtree didn’t exist. Someone on the core team can then occasionally push to the subtree.

If you want to set up a core member who pushes to the subtree, clone the repository as normal:

git clone <core_git_location>

You will also need to add a second remote that points to the lib repository:

git remote add lib <lib_git_location>

Once the repository is cloned, you need to remove the lib folder and commit the changes:

git rm -r lib
git add -A
git commit -am "removing lib folder and contents"

Now you need to add the lib folder back, but this time using the subtree commands and the lib repo:

git subtree add --prefix=lib lib master

Breakdown: --prefix defines the folder, lib is the name of the remote that points to the lib repository, and master is the branch you are pulling from that remote.

Pushing to the lib repository

If all you are doing is working on items unrelated to lib, you are done; continue pushing to the main repository as necessary. If you have changes to the lib folder in a repository that uses the lib repo as a subtree and you want to push them upstream, use the following command:

git subtree push --prefix=lib <lib remote name> <branch name>
# following the example
git subtree push --prefix=lib lib master
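A full round trip of the push can be sketched in a sandbox as well. This assumes git subtree is installed; the seed repository, the local bare lib-remote and the file contents are invented for the demo.

```shell
#!/bin/sh
# Sandbox sketch: commit a change under lib/ in the parent, then push it upstream.
set -e

work=$(mktemp -d)
cd "$work"

# Placeholder identity so the demo commits work on any machine.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# Stand-in for the hosted lib repository, seeded with one commit.
git init -q --bare lib-remote
git -C lib-remote symbolic-ref HEAD refs/heads/master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "shared code" > util.txt
git add -A
git commit -q -m "initial lib commit"
git push -q "$work/lib-remote" master
cd "$work"

# Parent project that consumes lib as a subtree.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master
echo "app code" > app.txt
git add -A
git commit -q -m "initial app commit"
git remote add lib "$work/lib-remote"
git subtree add --prefix=lib lib master

# Change a file under lib/ and commit it in the parent as usual.
echo "fix made in parent" >> lib/util.txt
git commit -q -am "fix in lib"

# Publish only the lib/ changes to the lib repository.
git subtree push --prefix=lib lib master

# Read the file back from master in the lib repository.
pushed=$(git -C "$work/lib-remote" show master:util.txt)
echo "$pushed"
```

The commit in the parent is an ordinary commit; only the subtree push step involves the lib repository.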

Pulling from the lib repository

If there are changes in the lib repository that were made outside the repository you are working in (for example, pushed from the other parent project), use the corresponding subtree pull command to bring them in:

git subtree pull --prefix=lib <lib remote name> <branch name>
# following the example
git subtree pull --prefix=lib lib master
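The pull direction can be rehearsed in the same kind of sandbox. This sketch assumes git subtree is installed; the seed repository plays the role of someone else committing to the lib repository, and all names and file contents are made up. GIT_MERGE_AUTOEDIT=no is set only so the merge does not open an editor when run from a terminal.

```shell
#!/bin/sh
# Sandbox sketch: bring new upstream lib commits into the parent's lib/ folder.
set -e

work=$(mktemp -d)
cd "$work"

# Placeholder identity, and no editor prompt for the merge commit.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
export GIT_MERGE_AUTOEDIT=no

# Stand-in for the hosted lib repository, seeded with one commit.
git init -q --bare lib-remote
git -C lib-remote symbolic-ref HEAD refs/heads/master
git init -q seed
cd seed
git symbolic-ref HEAD refs/heads/master
echo "shared code" > util.txt
git add -A
git commit -q -m "initial lib commit"
git push -q "$work/lib-remote" master
cd "$work"

# Parent project that consumes lib as a subtree.
git init -q parent
cd parent
git symbolic-ref HEAD refs/heads/master
echo "app code" > app.txt
git add -A
git commit -q -m "initial app commit"
git remote add lib "$work/lib-remote"
git subtree add --prefix=lib lib master

# Someone else adds a file to the lib repository.
cd "$work/seed"
echo "new helper" > extra.txt
git add -A
git commit -q -m "add extra helper"
git push -q "$work/lib-remote" master

# Back in the parent, pull the upstream change into lib/.
cd "$work/parent"
git subtree pull --prefix=lib lib master
ls lib
```

The new file lands under lib/ in the parent, and files outside lib/ are left alone.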

References

Here is the list of references I used during the process. When doing additional research on this topic, be aware that there is also a merge strategy called subtree merging, which is a different approach from the git subtree command.

https://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt

http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/

http://makingsoftware.wordpress.com/2013/02/16/using-git-subtrees-for-repository-separation/

  • me

    try git submodule…

    • John Teague

      We reviewed our options, including submodules. I’ll edit the blog explaining why subtrees are a better solution than submodules in most cases.

  • Eric

    If someone does not want to have the shared subtree history mixed into the history of each sharing parent repo, one or two additional options are appropriate.
    1) Use --squash with git subtree add or pull so that the change comes in as a single commit.
    2) If someone wants the convenience of using a “remote” name for the shared repo, they should use --no-tags when defining it, i.e. “git remote add --no-tags …”. Otherwise tags may be brought over from the subtree history into the remote tracking branch, which will cause subtree repo history to be mixed into displays of the parent repo DAG.

    • John Teague

      Great comments!! Thanks for the additional info.

  • Aaron

    Thanks for mentioning that subtree merging (the strategy) and the subtree command itself are not the same thing.

    I have been highly confused reading different articles recommending the two different approaches, and left wondering are they the same thing achieving the same physical result, or are they operating on two different concepts altogether.

    The existing Git literature does not provide clear distinctions between the two.