Mercurial Subrepos – How to create them and how they work

mercurial, subrepos, tortoisehg, version control

Situation

I have two .NET solutions (Foo and Bar) and a common library that contains ProjectA, ProjectB, and ProjectC. Foo and Bar reference one or more library projects, but the library projects are not located within the Foo and Bar Solution folders.

Directory structure:

-- My Documents*
   -- Development
      -- Libraries
         -- ProjectA
         -- ProjectB
         -- ProjectC
   -- Projects
      -- Foo
         -- Solution
            -- .hg
            -- .hgignore
            -- Foo { Project Folder }
            -- FooTests { Project Folder }
            -- Foo.sln { References ProjectA }
            -- Foo.suo
      -- Bar
         -- Solution
            -- .hg
            -- .hgignore
            -- Bar { Project Folder }
            -- BarTests { Project Folder }
            -- Bar.sln { References ProjectA and ProjectB }
            -- Bar.suo

*alas, I'm still using Windows XP…

Mercurial Subrepositories

Goal – I want to set up subrepos so that I can store the source code for any referenced library projects in my Foo and Bar repositories.

According to this page (which is literally the only documentation I can find on subrepos), setting up a subrepo requires executing the following commands from a DOS console window:

1| $ hg init main
2| $ cd main
3| $ hg init nested
4| $ echo test > nested/foo
5| $ hg -R nested add nested/foo
6| $ echo nested = nested > .hgsub
7| $ hg add .hgsub
8| $ hg ci -m "initial commit"

Questions

Can any or all of these steps be executed with TortoiseHG, as of version 0.9.2? If yes, how? I'm pretty sure lines 1-3 can, but I don't know about lines 4-7. None of this seems to be documented in TortoiseHG.
What does the above code do? (A line-by-line explanation would be much appreciated.) Here are some specific questions that came to mind as I was trying to decipher it:
- What does > do? I tried searching through the Mercurial docs for >, but didn't find anything.
- In line 5, I don't understand what nested/foo is. Where did foo come from? What is foo? A repository? A folder?
- Line 6 – this one completely baffles me.
- In line 7, I assume .hgsub is being added to main? Or is it being added to nested?
Let's say I get my subrepos set up, and my Bar repository is now up to revision 10. If I attempt to update my working directory to revision 7, will this cause my library folders (My Documents/Development/Libraries/ProjectA and .../Libraries/ProjectB) to update to whatever is stored in revision 7 as well?

Update

I added an 8th line of code: hg ci -m "initial commit". This does two things: (1) adds a .hgsubstate file to the main repo and (2) commits all changes, including the new subrepo, into the main repository (with the message "initial commit"). The purpose of the .hgsubstate file is to keep track of the state of all subrepos, so if you return to an earlier revision, it will grab the correct revision from all subrepos as well.

Update 2 – some instructions

After further experimentation, I think I can now provide the steps to solve my original problem (using mostly Windows Explorer and TortoiseHG):

Creating a subrepo

Libraries/ProjectA, Libraries/ProjectB, and the main repositories (Projects/Foo/Solution and Projects/Bar/Solution) must be separate repositories.
Open Projects/Foo/Solution.
Clone from Libraries/ProjectA to Projects/Foo/Solution.
Add ProjectA to the Foo repository.
Use a text editor to create a file called .hgsub, containing the following:
```
ProjectA = ProjectA
```
Open a DOS console window and enter the following commands (see note below):
```
cd c:\...\Projects\Foo\Solution
hg ci -m "Committing subrepo ProjectA"
```
For Bar, the steps are basically the same, except the .hgsub file should contain entries for both projects, like this:
```
ProjectA = ProjectA
ProjectB = ProjectB
```

Note: starting with TortoiseHG 0.10 (which is slated for March), you will be able to use the HG Commit shell command to do this, but for now, you have to use the command line.

Once this is all set up, it gets a little easier.

Committing changes – to commit changes to Foo or Bar, you do a Synchronize/Pull operation for each subrepo to get the subrepos in sync with the latest revisions in the library project repositories. Then you again use the command line to commit the changes (until version 0.10, when you can just use TortoiseHG to commit).

Updating working directory to an earlier revision – This seems to work pretty normally with TortoiseHG and doesn't seem to require use of any DOS commands. To actually work with the earlier revision in Visual Studio, you will need to do a Synchronize/Push operation to put the older version of the library projects back into the Libraries/ProjectX folder.

As much as I like TortoiseHG for simple tasks, it's probably better to write batch files for frequently used subrepo operations (especially updating).
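As a rough idea of what such a script could look like, here is a sketch for a POSIX shell (a Windows batch file would need different syntax). The function name pull_subrepos and the DRY_RUN variable are made up for this example, and it assumes .hgsub contains only plain "path = source" entries, with no [subpaths] section.

```shell
#!/bin/sh
# Sketch: pull and update every subrepo listed in a .hgsub file.
# pull_subrepos and DRY_RUN are made-up names for this example.

pull_subrepos() {
    hgsub_file=$1
    # Each entry has the form "path = source"; only the path is needed here.
    while IFS='=' read -r path source || [ -n "$path" ]; do
        # Trim surrounding whitespace from the path.
        path=$(printf '%s' "$path" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
        case $path in
            ''|'#'*) continue ;;        # skip blank lines and comments
        esac
        if [ -n "$DRY_RUN" ]; then
            echo "hg -R $path pull -u"  # only show what would be run
        else
            # Pull from the subrepo's default source and update it.
            hg -R "$path" pull -u < /dev/null
        fi
    done < "$hgsub_file"
}
```

Run it from the root of the main repository, for example with DRY_RUN=1 pull_subrepos .hgsub to preview the commands first. The main repository still needs its own commit afterwards to record the new subrepo revisions.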

Hope this helps someone in the future. If you see any mistakes, please let me know (or feel free to edit yourself if you are able).

Best Answer

You could probably try this stuff out and learn it more quickly than writing up your question took, but I'll bite.

Can any or all of these steps be executed with TortoiseHG, as of version 0.9.2? If yes, how?

TortoiseHG doesn't yet put GUI wrappers around subrepo creation, but TortoiseHG has always done a great job of working with the command line. Use the command line to create them and you're good to go.

What does the above code do (a line-by-line explanation would be much appreciated).

hg init main  # creates the main repo
cd main # enter the main repo
hg init nested # create the nested, internal repo
echo test > nested/foo # put the word test into the file foo in the nested repo
hg -R nested add nested/foo # do an add in the nested repo of file foo
echo nested = nested > .hgsub # put the string "nested = nested" into a file (in main) named .hgsub
hg add .hgsub # add the file .hgsub into the main repo

Here are some specific questions that came to mind as I was trying to decipher it: What does > do?

That has nothing to do with Mercurial; it's standard shell syntax (Unix and DOS) for "put the output of the command into a file named X".
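A quick way to see this without Mercurial at all is the sketch below. It works in a scratch directory, and the second .hgsub entry ("other = other") is a made-up name that is only there to show the difference between > and >>.

```shell
#!/bin/sh
# Sketch of shell redirection; the scratch directory and names are arbitrary.
demo=$(mktemp -d)
cd "$demo"
mkdir nested

echo test > nested/foo            # creates nested/foo containing the word "test"
echo "nested = nested" > .hgsub   # ">" creates the file, or overwrites it if it exists
echo "other = other" >> .hgsub    # ">>" appends a line instead of overwriting

cat nested/foo
cat .hgsub
```

So lines 4 and 6 of the question's listing are just a console shortcut for creating small text files; a text editor would do the same job.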

In line 5, I don't understand what nested/foo is. Where did foo come from? What is foo? A repository? A folder?

It's a file in the subrepo. foo is a traditional placeholder name, and its contents are the arbitrary string "test".

Line 6 - this one completely baffles me.

It writes the line "nested = nested" into .hgsub. The left-hand side is the path of the subrepo inside main, and the right-hand side is the source it comes from; here both happen to be nested.

In line 7, I assume .hgsub is being added to main? Or is it being added to nested?

main

Let's say I get my subrepos set up, and my Bar repository is now up to revision 10. If I attempt to update to revision 7, will this cause my library folders (My Documents/Development/Libraries/ProjectA and .../Libraries/ProjectB) to update to whatever is stored in revision 7 as well? Given that Foo also refers to Libraries/ProjectA, this could get interesting!

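Revision numbers won't carry across: revision 7 of Bar has nothing to do with revision 7 of ProjectA. What does carry across is the .hgsubstate file, which records the subrepo changesets each Bar revision was committed with, so updating Bar to revision 7 checks out those changesets in the subrepo copies inside Bar. You have control over this by editing the .hgsubstate file.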

Using Interactive Rebase

You could do

git rebase -i -p <some HEAD before all of your bad commits>

Then mark all of your bad commits as "edit" in the rebase file. If you also want to change your first commit, you have to manually add it as the first line in the rebase file (follow the format of the other lines). Then, when git asks you to amend each commit, do

 git commit --amend --author "New Author Name <email@address.com>"

edit or just close the editor that opens, and then do

git rebase --continue

to continue the rebase.

You could skip opening the editor altogether here by appending --no-edit so that the command will be:

git commit --amend --author "New Author Name <email@address.com>" --no-edit && \
git rebase --continue

Single Commit

As some of the commenters have noted, if you just want to change the most recent commit, the rebase command is not necessary. Just do

 git commit --amend --author "New Author Name <email@address.com>"

This will change the author to the name specified, but the committer will be set to your configured user in git config user.name and git config user.email. If you want to set the committer to something you specify, this will set both the author and the committer:

 git -c user.name="New Author Name" -c user.email=email@address.com commit --amend --reset-author

Note on Merge Commits

There was a slight flaw in my original response. If there are any merge commits between the current HEAD and your <some HEAD before all your bad commits>, then git rebase will flatten them (and by the way, if you use GitHub pull requests, there are going to be a ton of merge commits in your history). This can very often lead to a very different history (as duplicate changes may be "rebased out"), and in the worst case, it can lead to git rebase asking you to resolve difficult merge conflicts (which were likely already resolved in the merge commits). The solution is to use the -p flag to git rebase, which will preserve the merge structure of your history. The manpage for git rebase warns that using -p and -i can lead to issues, but in the BUGS section it says "Editing commits and rewording their commit messages should work fine."

I've added -p to the above command. For the case where you're just changing the most recent commit, this is not an issue.

Update for modern git clients (July 2020)

Use --rebase-merges instead of -p (-p is deprecated and has serious issues).
