C++ – How to “normalize” a pathname using boost::filesystem

We are using boost::filesystem in our application. I have a 'full' path that is constructed by concatenating several paths together:

#include <boost/filesystem/operations.hpp>
#include <iostream>
     
namespace bf = boost::filesystem;

int main()
{
    bf::path root("c:\\some\\deep\\application\\folder");
    bf::path subdir("..\\configuration\\instance");
    bf::path cfgfile("..\\instance\\myfile.cfg");

    bf::path final ( root / subdir / cfgfile);

    std::cout << final.file_string();
}

The final path is printed as:

c:\some\deep\application\folder\..\configuration\instance\..\instance\myfile.cfg

This is a valid path, but when I display it to the user I'd prefer it to be normalized. (Note: I'm not even sure if "normalized" is the correct word for this). Like this:

c:\some\deep\application\configuration\instance\myfile.cfg

Earlier versions of Boost had a normalize() function – but it seems to have been deprecated and removed (without any explanation).

Is there a reason I should not use the BOOST_FILESYSTEM_NO_DEPRECATED macro? Is there an alternative way to do this with the Boost Filesystem library? Or should I write code to manipulate the path directly as a string?

Best Answer

Boost v1.48 and above

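You can use boost::filesystem::canonical, which returns an absolute path with no symbolic link, "." or ".." elements. The path must exist; otherwise the call reports an error (by throwing, or through ec in the overloads that take one):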

path canonical(const path& p, const path& base = current_path());
path canonical(const path& p, system::error_code& ec);
path canonical(const path& p, const path& base, system::error_code& ec);

http://www.boost.org/doc/libs/1_48_0/libs/filesystem/v3/doc/reference.html#canonical

v1.48 and above also provide the boost::filesystem::read_symlink function for resolving symbolic links.

Boost versions prior to v1.48

As mentioned in other answers, you can't normalise because boost::filesystem can't follow symbolic links. However, you can write a function that normalises "as much as possible" (assuming "." and ".." are treated normally) because boost offers the ability to determine whether or not a file is a symbolic link.

That is to say, if the parent of a ".." is a symbolic link, you have to retain the "..". Otherwise it is probably safe to drop it, and it is probably always safe to remove ".".

It's similar to manipulating the actual string, but slightly more elegant.

boost::filesystem::path resolve(
    const boost::filesystem::path& p,
    const boost::filesystem::path& base = boost::filesystem::current_path())
{
    boost::filesystem::path abs_p = boost::filesystem::absolute(p,base);
    boost::filesystem::path result;
    for(boost::filesystem::path::iterator it=abs_p.begin();
        it!=abs_p.end();
        ++it)
    {
        if(*it == "..")
        {
            // /a/b/.. is not necessarily /a if b is a symbolic link
            if(boost::filesystem::is_symlink(result) )
                result /= *it;
            // /a/b/../.. is not /a/b/.. under most circumstances
            // We can end up with ..s in our result because of symbolic links
            else if(result.filename() == "..")
                result /= *it;
            // Otherwise it should be safe to resolve the parent
            else
                result = result.parent_path();
        }
        else if(*it == ".")
        {
            // Ignore
        }
        else
        {
            // Just cat other path entries
            result /= *it;
        }
    }
    return result;
}
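
The question also asks about manipulating the path directly as a string. For comparison, here is a minimal sketch of that purely lexical approach, using only the standard library. The function name collapse_dots is hypothetical and not part of Boost. The sketch assumes an absolute Windows-style path like the one in the question, and it never consults the file system, so it can give the wrong answer when a component is a symbolic link.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): collapses "." and ".." purely
// lexically. It never touches the file system, so symbolic links are ignored.
// Assumes an absolute path whose first component is the root (e.g. "c:");
// both '\\' and '/' are accepted as separators.
std::string collapse_dots(const std::string& path)
{
    std::vector<std::string> parts;   // components kept so far
    std::string current;              // component currently being read

    for (std::string::size_type i = 0; i <= path.size(); ++i)
    {
        const bool at_end = (i == path.size());
        if (!at_end && path[i] != '\\' && path[i] != '/')
        {
            current += path[i];
            continue;
        }

        // A separator or the end of the string closes the current component.
        if (current == "..")
        {
            // Drop the previous component, but never the root itself.
            if (parts.size() > 1)
                parts.pop_back();
        }
        else if (!current.empty() && current != ".")
        {
            parts.push_back(current);
        }
        current.clear();
    }

    // Re-join the surviving components with backslashes.
    std::string result;
    for (std::vector<std::string>::size_type i = 0; i < parts.size(); ++i)
    {
        if (i != 0)
            result += '\\';
        result += parts[i];
    }
    return result;
}
```

Applied to the path from the question, this yields c:\some\deep\application\configuration\instance\myfile.cfg. Unlike resolve() above, it cannot tell when dropping a ".." is unsafe, so it is probably best kept for display purposes only.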