Using Git subtrees to split a repository


We were in a position where we needed to create a new back-end server for an application. The current application runs on a MEAN stack (MongoDB, Express.js, AngularJS, Node.js), but a new client wants the back end deployed onto a JBoss server.  That meant building a completely different back end while sharing the front end between the two.  The approach we opted for was to use git subtrees to split the UI code into its own repository, shared between the Node.js repo and the Java repo.

To be clear, I would only use this approach in very specific situations like this one.  If possible, keeping things simple in a single repository is usually best.  But if you’re in the same situation, hopefully this will be helpful for you.

Splitting the Original Repository

The subtree commands effectively take a folder and split it out into another repository.  Everything you want in the subtree repo needs to be in the same folder. For the sake of this example, let’s assume you have a /lib folder that you want to extract to a separate repo.

Create a new folder and initialize a bare git repo:

mkdir lib-repo
cd lib-repo
git init --bare

Create a remote repository on GitHub (or wherever you host your code) for the lib project and add it as the origin remote of lib-repo.

From within your parent project folder, use the subtree split command and put the lib folder in a separate branch:

git subtree split --prefix=lib -b split

Push the contents of the split branch to your newly created bare repo, using the file path to the repository:

git push ~/lib-repo split:master

This pushes the split branch to your new repo as its master branch.
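The split and push steps above can be tried end to end in a throwaway sandbox before touching a real project. The following is a minimal sketch, not the exact setup from this post: the temporary directory, the file names (ui.js, server.js) and the demo identity are all made up for illustration.

```shell
#!/bin/sh
# Sandbox sketch: split lib/ out of a throwaway parent repo into a bare repo.
# All paths, file names and the demo identity are hypothetical.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Parent project with a lib/ folder and one unrelated file.
git init -q "$sandbox/parent"
cd "$sandbox/parent"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
mkdir lib
echo "shared ui code" > lib/ui.js
echo "server code" > server.js
git add -A
git commit -q -m "initial commit"

# Bare repository that will receive the extracted history.
git init -q --bare "$sandbox/lib-repo"

# Put the history of lib/ on its own branch, then push it as master.
git subtree split --prefix=lib -b split
git push -q "$sandbox/lib-repo" split:master

# List what the new repo contains: the files of lib/, now at the root.
split_files=$(git --git-dir="$sandbox/lib-repo" ls-tree -r --name-only master)
echo "$split_files"
```

The listing at the end is a quick way to confirm that only the contents of lib ended up in the new repository, with the lib/ prefix stripped.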

From lib-repo, push the master branch to your origin remote.
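The post gives no command for this step, so here is one way it might look. This is a sketch under assumptions: a second local bare repository (hosted-lib.git, a made-up name) stands in for the hosted remote, and lib-repo is seeded with a single hypothetical commit instead of a real split.

```shell
#!/bin/sh
# Sandbox sketch: push master from the bare lib-repo to its origin remote.
# hosted-lib.git is a local stand-in for a hosted repository (assumption).
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Stand-in for the empty repository created on the hosting service.
git init -q --bare "$sandbox/hosted-lib.git"

# Seed a bare lib-repo with one commit on master, as the split push would.
git init -q "$sandbox/work"
cd "$sandbox/work"
git symbolic-ref HEAD refs/heads/master
echo "shared ui code" > ui.js
git add -A
git commit -q -m "lib history"
git init -q --bare "$sandbox/lib-repo"
git push -q "$sandbox/lib-repo" master:master

# The step described above: register origin inside lib-repo and push master.
cd "$sandbox/lib-repo"
git remote add origin "$sandbox/hosted-lib.git"
git push -q origin master

# Both repositories should now point at the same commit.
local_tip=$(git rev-parse master)
pushed=$(git --git-dir="$sandbox/hosted-lib.git" rev-parse master)
echo "$local_tip $pushed"
```

In a real project the remote add line would take the URL of the hosted repository instead of a local path.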

Now that the lib folder lives in its new repository, you need to remove it from the parent repository and add it back as a subtree from its new repository:

git remote add lib <url_to_lib_remote>
git rm -r lib
git add -A
git commit -am "removing lib folder"
git subtree add --prefix=lib lib master
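To see the whole swap in one place, here is a sandbox sketch that repeats the split and then replaces the plain folder with a subtree. The paths and file names are hypothetical, and the bare repository is reached by file path rather than a hosted URL.

```shell
#!/bin/sh
# Sandbox sketch: replace the plain lib/ folder with a subtree of the new repo.
# All paths and file names are hypothetical.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Parent project with a lib/ folder.
git init -q "$sandbox/parent"
cd "$sandbox/parent"
git symbolic-ref HEAD refs/heads/master
mkdir lib
echo "shared ui code" > lib/ui.js
echo "server code" > server.js
git add -A
git commit -q -m "initial commit"

# Split lib/ into a bare repo, as in the earlier steps.
git init -q --bare "$sandbox/lib-repo"
git subtree split --prefix=lib -b split
git push -q "$sandbox/lib-repo" split:master

# Remove the folder, commit, then add it back from the lib remote.
git remote add lib "$sandbox/lib-repo"
git rm -r -q lib
git commit -q -m "removing lib folder"
git subtree add --prefix=lib lib master

# Show the commit that git subtree add created.
git log -1 --format=%B
restored=$(cat lib/ui.js)
```

Note that git subtree add expects a clean working tree and a prefix folder that does not exist yet, which is why the removal is committed first.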

 

Setting up a new user with the subtree

When a new user wants to work on your repository, they will need to set up the subtree repo manually.  The split-off folder ends up living in two repositories: the existing repo and the one set up as a subtree.  Changes do not flow between them automatically; you need to explicitly push changes to the subtree repository.  This is a mixed blessing.  If you have a repository with a few occasional committers, they can pull the original repository and push as if the subtree didn’t exist.  Then someone on the core team can occasionally push to the subtree.

If you want to set up a core member who pushes to the subtree, clone the repository as normal:

git clone <core_git_location>

You will also need to add a second remote that points to the lib repository:

git remote add lib <lib_git_location>

Once the repository is cloned, you need to remove the lib folder and commit the changes:

git rm -r lib
git add -A
git commit -am "removing lib folder and contents"

Now you need to add the lib folder back, but this time using the subtree commands and the lib remote:

git subtree add --prefix=lib lib master

Breakdown: --prefix defines the folder, lib is the name of the remote for the lib project, and master is the branch you are pulling from the lib remote.

Pushing to the lib repository

If all you are doing is working on non-lib items, you are done; continue pushing to the main repository as necessary.  If you have changed lib in a repository that uses the lib repo as a subtree and you want to push those changes upstream, use the following command:

git subtree push --prefix=lib <lib remote name> <branch name>
#following the example 
git subtree push --prefix=lib lib master
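A sandbox run makes it easier to see what the push actually sends upstream. The sketch below assumes a local bare repository (lib.git) as the lib remote and a scratch clone (seed) used only to give it a first commit; all names are hypothetical.

```shell
#!/bin/sh
# Sandbox sketch: commit a change under lib/ in the main project, then push
# only that part of the history to the lib remote. All names are hypothetical.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

sandbox=$(mktemp -d)

# Stand-in for the lib remote: a bare repo seeded with one commit.
git init -q --bare "$sandbox/lib.git"
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo "shared ui code" > ui.js
git add -A
git commit -q -m "initial lib commit"
git push -q "$sandbox/lib.git" master

# Main project that uses lib as a subtree.
git init -q "$sandbox/parent"
cd "$sandbox/parent"
git symbolic-ref HEAD refs/heads/master
echo "server code" > server.js
git add -A
git commit -q -m "initial commit"
git remote add lib "$sandbox/lib.git"
git subtree add --prefix=lib lib master

# Edit a file under lib/ and commit it in the main project as usual.
echo "// ui fix" >> lib/ui.js
git commit -q -am "fix ui"

# Send the lib/ part of that history upstream.
git subtree push --prefix=lib lib master

# The lib remote now has the change, with ui.js at its root.
upstream_file=$(git --git-dir="$sandbox/lib.git" show master:ui.js)
echo "$upstream_file"
```

The commit itself is an ordinary commit in the main project; only the subtree push step is specific to the lib remote.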

Pulling from the lib repository

If there are changes in the lib repository that you want to bring into a repository that uses it as a subtree, use the corresponding subtree pull command:

git subtree pull --prefix=lib <lib remote name> <branch name>
#following the example 
git subtree pull --prefix=lib lib master
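The reverse direction can be tried the same way. In this sketch a scratch clone (seed) plays the part of another contributor committing to the lib repository; the paths, file names and the GIT_MERGE_AUTOEDIT setting are choices made for the demo, not part of the original workflow.

```shell
#!/bin/sh
# Sandbox sketch: a change lands in the lib remote and is pulled into lib/
# of the main project. All paths and file names are hypothetical.
set -e

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export GIT_MERGE_AUTOEDIT=no   # keep the default merge message, no editor

sandbox=$(mktemp -d)

# Stand-in for the lib remote: a bare repo seeded with one commit.
git init -q --bare "$sandbox/lib.git"
git init -q "$sandbox/seed"
cd "$sandbox/seed"
git symbolic-ref HEAD refs/heads/master
echo "shared ui code" > ui.js
git add -A
git commit -q -m "initial lib commit"
git push -q "$sandbox/lib.git" master

# Main project that uses lib as a subtree.
git init -q "$sandbox/parent"
cd "$sandbox/parent"
git symbolic-ref HEAD refs/heads/master
echo "server code" > server.js
git add -A
git commit -q -m "initial commit"
git remote add lib "$sandbox/lib.git"
git subtree add --prefix=lib lib master

# Someone else changes lib upstream.
cd "$sandbox/seed"
echo "// upstream change" >> ui.js
git commit -q -am "upstream change"
git push -q "$sandbox/lib.git" master

# Bring that change into lib/ of the main project.
cd "$sandbox/parent"
git subtree pull --prefix=lib lib master

cat lib/ui.js
```

The pull is recorded as a merge in the main project, so the working tree needs to be clean before running it.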

References

Here is the list of references I used during the process. When doing additional research on this topic, be aware that subtree merging is a separate strategy and not the same thing as the subtree commands used here.

https://github.com/apenwarr/git-subtree/blob/master/git-subtree.txt

http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/

http://makingsoftware.wordpress.com/2013/02/16/using-git-subtrees-for-repository-separation/
