Converting Large Subversion Repositories to Slimmed Down Git Repositories

git repository svn

I have the pleasure of taking over a 14-year-old Subversion repository that has two key characteristics:

  1. 111,000 revisions, about 10% of which are substantial;
  2. The repository dump is about 73 GB, due to a large number of binary data files that have been updated, added and modified over the years.

Here's what I'd like to do, but I'm not sure if it's possible: I'd like to strip the history of the binary files, keeping only the code changes, and then convert the result to Git. What are your recommendations?

Best Answer

"not sure if it's possible: I'd like to strip the history of the binary files, only keeping the code changes. Then convert that to git."

I have not tried this myself, but I am fairly sure it is possible if you approach the task in exactly the order you described above:

  1. strip the history of the binary files, only keeping the code changes, in SVN first

  2. migrate to Git afterwards.

Step 1 can be accomplished by moving the binary files, together with their history, into a temporary folder inside the repo (for example with svn move). Then create a fresh copy of them in your project's local working directory and check them in as if they were new files, so that the new files have no history. Finally, use the procedure described in this Server Fault question to get rid of the temporary folder (using svnadmin dump, svndumpfilter, and svnadmin load), which deletes the full history.
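The dump, filter, and load part of step 1 could be sketched roughly as follows. This is a minimal, untested sketch rather than a verified procedure: the repository paths and the temporary folder name /old_binaries are hypothetical, and the new repository is assumed to have been created beforehand with svnadmin create. svndumpfilter exclude matches by path prefix, so the prefix has to be given relative to the repository root.

```python
import subprocess


def build_filter_pipeline(old_repo, new_repo, excluded_prefix):
    """Return the three commands in pipeline order: dump -> filter -> load.

    old_repo / new_repo are local repository paths (hypothetical here);
    excluded_prefix is the temporary folder holding the moved binaries.
    """
    dump = ["svnadmin", "dump", old_repo]
    filt = ["svndumpfilter", "exclude", excluded_prefix]
    load = ["svnadmin", "load", new_repo]
    return [dump, filt, load]


def run_pipeline(commands):
    """Chain the commands so that each reads the previous one's stdout."""
    procs = []
    prev_stdout = None
    for index, cmd in enumerate(commands):
        is_last = index == len(commands) - 1
        proc = subprocess.Popen(
            cmd,
            stdin=prev_stdout,
            stdout=None if is_last else subprocess.PIPE,
        )
        if prev_stdout is not None:
            # Close our copy so the upstream process sees a broken pipe
            # if the downstream process exits early.
            prev_stdout.close()
        prev_stdout = proc.stdout
        procs.append(proc)

    # Wait for every stage, then report the first one that failed.
    codes = [proc.wait() for proc in procs]
    for cmd, code in zip(commands, codes):
        if code != 0:
            raise RuntimeError(f"{cmd[0]} {cmd[1]} exited with status {code}")


# Example call (hypothetical paths, not executed here):
# run_pipeline(build_filter_pipeline("/srv/svn/project",
#                                    "/srv/svn/project-slim",
#                                    "/old_binaries"))
```

Streaming the dump through pipes avoids writing the full 73 GB dump file to disk, though it is still worth trying the whole procedure on a copy of the repository first.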

Note that with this approach, whenever you check out an older revision of the project, the binary files will be missing completely. To keep this from becoming a problem, consider keeping the old SVN repo online, as suggested by @RobertHarvey.
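For step 2, the migration itself could then be started with git svn. The sketch below only assembles the command. The repository URL, target directory, and authors file are hypothetical, and --stdlayout assumes the repository uses the conventional trunk/branches/tags layout, which the question does not confirm.

```python
import subprocess


def build_git_svn_clone(svn_url, target_dir, authors_file=None, stdlayout=True):
    """Assemble a `git svn clone` command for the slimmed-down repository."""
    cmd = ["git", "svn", "clone"]
    if stdlayout:
        # Assumes a trunk/branches/tags layout in the SVN repository.
        cmd.append("--stdlayout")
    if authors_file:
        # Optional file mapping SVN user names to Git author identities.
        cmd.append(f"--authors-file={authors_file}")
    cmd += [svn_url, target_dir]
    return cmd


def run_clone(cmd):
    """Run the clone and raise CalledProcessError if git svn fails."""
    subprocess.run(cmd, check=True)


# Example call (hypothetical URL and paths, not executed here):
# run_clone(build_git_svn_clone("file:///srv/svn/project-slim",
#                               "project-git",
#                               authors_file="authors.txt"))
```

With around 111,000 revisions, the clone will probably take a long time even after the binary history is gone, so running it against a local file:// copy of the filtered repository is likely the more practical choice.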
