What are the best steps to convert multi-repos to a mono-repo?
This is what I have so far:
- for each repo, check out the most recent branch (integration branch, usually)
- for each repo, copy the repo folder to the new git repo (the mono repo)
- for each folder, delete the old .git folder
- Stage all the files, commit, and push the new mono-repo
My only question at the moment – will the existing .gitignore files work properly for the subfolders in the new mono-repo?
Best Answer
Don't copy the files; merge the repositories instead. Git doesn't make a big difference between “different repository” and “different branch”. More precisely, a repository is a collection of tags and branches. I'll assume that you want to merge the master branch of all repos.
General approach (but see discussion of git-subtree below):
1. Think about the layout of your monorepo. I'll assume that, to start with, each current repository becomes a sub-folder of the monorepo in order to avoid conflicts.
2. For each current repository, move the repository contents into a subfolder and commit the change. You can use the `git mv` command to do this easily. E.g. if your component is called `libfoo`, move everything that currently sits at the top level of its repository into a `libfoo/` folder.
3. Create a new repository for your monorepo, and add all existing repos as a “remote”. Despite its name, a remote repository can be a path to some directory on the same file system. Then `git fetch --all` remotes to load their history into the monorepo's git database. Afterwards, you can list all branches with `git branch --all`; each remote's branches show up under `remotes/<remote-name>/`.
4. For each remote, merge its master branch. There should be no conflicts because everything is in a separate directory. Because the histories share no common ancestor, Git 2.9 and later require the `--allow-unrelated-histories` flag for these merges.
5. Now you're done, and you have a monorepo without loss of history. You can remove the remotes.
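The steps above can be sketched as a shell script. This is only an illustration under assumptions: it builds two throwaway repositories (the names `libfoo` and `libbar`, the `README` files, and the `demo` identity are all made up for the demo) in a temporary directory, then runs the move, remote, fetch, and merge steps on them. For real repositories, replace the stand-ins with your own paths or URLs.

```shell
set -eu

# Fixed identity so commits work on an unconfigured machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-ins for the existing repositories (hypothetical names).
for name in libfoo libbar; do
    git -c init.defaultBranch=master init -q "$name"
    echo "$name" > "$name/README"
    git -C "$name" add README
    git -C "$name" commit -q -m "Initial $name commit"
done

# Step 2: inside each repo, move the contents into a folder named after it.
for name in libfoo libbar; do
    mkdir "$name/$name"
    git -C "$name" mv README "$name/"
    git -C "$name" commit -q -m "Move $name into $name/"
done

# Step 3: create the monorepo and register every old repo as a remote.
git -c init.defaultBranch=master init -q monorepo
cd monorepo
git commit -q --allow-empty -m "Start monorepo"
for name in libfoo libbar; do
    git remote add "$name" "$work/$name"
done
git fetch -q --all
git branch --all

# Step 4: merge each remote's master; the histories are unrelated,
# so Git 2.9+ needs --allow-unrelated-histories.
for name in libfoo libbar; do
    git merge -q --no-edit --allow-unrelated-histories "$name/master"
done

# Step 5: the remotes are no longer needed.
for name in libfoo libbar; do
    git remote remove "$name"
done

git --no-pager log --oneline --graph
```

The empty first commit is a choice made for this sketch so that both merges are handled the same way; it is not required by the approach itself.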
But careful: you can only merge one branch of each repo. If one of the original repositories has multiple branches, they can no longer be merged without excessive conflicts. Consider rebasing them after you move the repository contents into one folder, but before merging everything into the monorepo.
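A small sketch of that rebase, assuming a throwaway `libfoo` repository with a hypothetical `feature` branch and a made-up `notes.txt` file. It relies on Git's rename detection to carry the branch's edit over to the file's new location. It only covers edits to files that already exist on master; files the branch adds at the old top level are not handled here and may need to be moved by hand.

```shell
set -eu

# Fixed identity so commits work on an unconfigured machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in repository with one file on master (hypothetical content).
git -c init.defaultBranch=master init -q libfoo
cd libfoo
printf 'line one\nline two\nline three\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# A second branch that edits the file at its old, top-level location.
git checkout -q -b feature
printf 'line one\nline two\nline three\nfeature line\n' > notes.txt
git commit -q -am "Extend notes on feature"

# Back on master, move the contents into the subfolder.
git checkout -q master
mkdir libfoo
git mv notes.txt libfoo/
git commit -q -m "Move libfoo into libfoo/"

# Replay feature on top of the moved layout; this checks out feature
# and reapplies its commits onto master.
git rebase -q master feature

git --no-pager log --oneline --name-status
```

After this, both branches use the `libfoo/` layout, so whichever one you pick for the monorepo merge lands in the right directory.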
In practice, you can use `git subtree` to automate most of these steps. The subtree command allows you to merge a specific branch into a specific directory.
For each existing repo, add it as a subtree with `git subtree add`.
The `-P`/`--prefix` option is the directory under which the repository contents should be added. In place of the path to a repo, any repository URL can be used. By default this will add the complete history; alternatively, you can `--squash` the history into a single commit.
Git-subtree is an extremely powerful tool for manipulating monorepos. You can also extract a directory into a separate repository (`git subtree push`) or merge updates from the original repo (`git subtree pull`). For example, you might use this to translate different branches:
- Create a `feature` branch in the monorepo: `git checkout -b feature`
- Pull the original repo's `feature` branch into the correct directory of the monorepo: `git subtree pull -P libfoo/ ../path/to/libfoorepo feature`.
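Here is a hedged end-to-end sketch of the subtree workflow. It assumes `git subtree` is installed (it ships in Git's contrib area and is a separate package on some systems) and uses a throwaway `libfoorepo` with made-up files standing in for the real repository. The `git subtree add` invocation mirrors the form of the `git subtree pull` command above.

```shell
set -eu

# Fixed identity so commits work on an unconfigured machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Keep merges from opening an editor for the commit message.
export GIT_MERGE_AUTOEDIT=no

work=$(mktemp -d)
cd "$work"

# Stand-in for an existing repo with master and feature branches.
git -c init.defaultBranch=master init -q libfoorepo
echo "libfoo" > libfoorepo/README
git -C libfoorepo add README
git -C libfoorepo commit -q -m "Initial libfoo commit"
git -C libfoorepo checkout -q -b feature
echo "feature work" > libfoorepo/feature.txt
git -C libfoorepo add feature.txt
git -C libfoorepo commit -q -m "Add feature work"
git -C libfoorepo checkout -q master

# The monorepo needs at least one commit before a subtree can be added.
git -c init.defaultBranch=master init -q monorepo
cd monorepo
git commit -q --allow-empty -m "Start monorepo"

# Import master of the old repo, with full history, under libfoo/.
git subtree add -q -P libfoo/ "$work/libfoorepo" master

# Translate the old feature branch: create a matching branch in the
# monorepo, then pull the old branch into the same prefix.
git checkout -q -b feature
git subtree pull -q -P libfoo/ "$work/libfoorepo" feature

git --no-pager log --oneline --graph
```

Adding `--squash` to the `add` and `pull` commands would import each state as a single commit instead of the complete history.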
But consider whether a monorepo is really appropriate for your use case. It may still be desirable to have different repos available independently. The main contender is git submodules, where one repository is mounted as a sub-directory of another. However, the experience is not seamless. The branch history is not shared with the submodule. If you edit code in a submodule you have to commit that work separately. Git submodules are most useful for “vendoring” external dependencies that are pinned to a specific version, not for combined development.
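To illustrate the pinning, here is a minimal sketch that mounts a throwaway `libfoo` repository as a submodule of a hypothetical `app` repository (both names and the `vendor/libfoo` path are assumptions for the demo). The `protocol.file.allow=always` setting is only there because the demo clones from a local path, which recent Git versions refuse for submodules by default. It should not be needed for ordinary remote URLs.

```shell
set -eu

# Fixed identity so commits work on an unconfigured machine (demo values).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-in for the external dependency.
git -c init.defaultBranch=master init -q libfoo
echo "libfoo" > libfoo/README
git -C libfoo add README
git -C libfoo commit -q -m "Initial libfoo commit"

# The superproject that vendors it.
git -c init.defaultBranch=master init -q app
cd app
git commit -q --allow-empty -m "Start app"

# Mount libfoo under vendor/libfoo; this stages .gitmodules and a
# gitlink entry that records one specific libfoo commit.
git -c protocol.file.allow=always submodule --quiet add "$work/libfoo" vendor/libfoo
git commit -q -m "Vendor libfoo as a submodule"

# Mode 160000 marks the entry as a pinned commit, not a directory of files.
git ls-files -s vendor/libfoo
git submodule status
```

The superproject stores only that commit ID, which is why work inside the submodule has to be committed there separately and then re-pinned in the superproject.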
Whatever approach you use, gitignore files will continue to work because any patterns are matched relative to the gitignore file. A repository can contain multiple gitignore files.
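A quick way to check this yourself is `git check-ignore`. The sketch below assumes a made-up layout in which only `libfoo/` brought a `.gitignore` along; the file names and patterns are hypothetical.

```shell
set -eu

work=$(mktemp -d)
cd "$work"
git init -q monorepo
cd monorepo

# Only libfoo carries a .gitignore; the leading slash in /build/ anchors
# the pattern to the folder that contains the .gitignore file.
mkdir -p libfoo/build libbar/build
printf '*.log\n/build/\n' > libfoo/.gitignore
touch libfoo/debug.log libfoo/build/out.o
touch libbar/debug.log libbar/build/out.o

# check-ignore exits with 0 when the path is ignored and 1 when it is not.
for path in libfoo/debug.log libfoo/build/out.o \
            libbar/debug.log libbar/build/out.o; do
    if git check-ignore -q "$path"; then
        echo "ignored:     $path"
    else
        echo "not ignored: $path"
    fi
done
```

The patterns from `libfoo/.gitignore` apply only beneath `libfoo/`, so the identically named files under `libbar/` are left alone.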