Linux – the fastest way to “clone” a file in Linux

I would like to use an application API that is not "crash safe"; in other words, there is a high likelihood of the data file being corrupt and unreadable if the application crashes.

The file itself is a "metadata file" and should not get very big: a few hundred MB at most.

What I want to do is:

1. Force the application to access the file in "direct mode" (no OS caching).
2. Pause updates at regular "checkpoint" intervals.
3. Perform a flush() (some data has probably been flushed automatically already).
4. Now that I know the file is consistent, clone it.
5. If there is an "old clone", delete it.
6. Resume making changes to the original file.
7. Loop.
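
The checkpoint part of this loop (flush, clone, drop the old clone) could be sketched as follows. This is only an illustration under assumptions: `checkpoint`, `data_path` and `clone_path` are hypothetical names, the caller is assumed to have already paused updates and called the application API's own flush, and replacing the old clone with an atomic rename (instead of deleting it first) is a design choice that is not part of the steps above.

```python
import os
import shutil


def fsync_path(path):
    """Ask the kernel to write the file's dirty pages to the storage device."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def checkpoint(data_path, clone_path):
    """Replace clone_path with a fresh copy of data_path.

    Assumes the caller has paused all updates to data_path.
    """
    fsync_path(data_path)

    # Copy to a temporary name so a crash mid-copy never damages the old clone.
    tmp_path = clone_path + ".tmp"
    shutil.copyfile(data_path, tmp_path)
    fsync_path(tmp_path)

    # Atomically swap the new clone in; this also discards the old clone.
    os.replace(tmp_path, clone_path)
```

Because the old clone is only replaced after the new one is completely written, there is always at least one consistent copy on disk, even if the process dies in the middle of a checkpoint.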

Could I use a special-purpose file system that makes some kind of "zero copy" of the file, combined with copy-on-write of the modified sectors of the original file, to get the clone "almost free" (with minimum disk IO)?

Also, can I do the "clone" without having to fork a process? (I don't know if the Linux file API offers a "cp" system-call).
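
For illustration, here is a sketch of an in-process clone that first asks the filesystem for a copy-on-write "reflink" and falls back to an ordinary copy. Reflinks are a technique not mentioned elsewhere on this page: the `FICLONE` ioctl is only honoured by some filesystems (Btrfs and XFS with reflink support, for example), so treat that part as an assumption about the target system. `clone_file` is a hypothetical helper name, and the ioctl number is hard-coded from `<linux/fs.h>`.

```python
import fcntl
import os
import shutil

# FICLONE from <linux/fs.h>: _IOW(0x94, 9, int). Hard-coded here rather than
# relying on the fcntl module to export it.
FICLONE = 0x40049409


def clone_file(src_path, dst_path):
    """Clone src_path to dst_path without spawning a child process.

    Returns "reflink" if the filesystem shared the data blocks
    copy-on-write, or "copy" if a regular copy had to be made.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the filesystem to make dst share src's extents.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink"
        except OSError:
            # Not supported here (or another error): fall through to a
            # plain copy, which raises if the problem is real.
            pass
    shutil.copyfile(src_path, dst_path)
    return "copy"
```

Either path runs entirely inside the calling process, so no fork is needed. When the reflink succeeds, no data blocks are copied at clone time; the filesystem only copies blocks later, when one of the two files is modified.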

Best Answer

You could use LVM snapshotting for this instead of cloning. If something goes wrong, just copy the file back from the snapshot.

There is a libdevmapper/libdevmapper-event-lvm2snapshot which could be helpful in doing this programmatically (without a fork): http://sourceware.org/dm/

Edit:

If you can change your program, here is another solution: https://stackoverflow.com/questions/1565177/can-i-do-a-copy-on-write-memcpy-in-linux

mmap() the file twice, once normally (MAP_SHARED) and once with MAP_PRIVATE. Writes to the MAP_PRIVATE mapping are copy-on-write and never reach the file.
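
A minimal sketch of the double mapping, using Python's `mmap` module as a thin wrapper over the mmap() system call. `map_twice` is a hypothetical helper name, and the file is assumed to be non-empty and opened read-write.

```python
import mmap
import os


def map_twice(path):
    """Return (shared, private) read-write mappings of the whole file."""
    fd = os.open(path, os.O_RDWR)
    try:
        # Writes through this mapping are carried through to the file.
        shared = mmap.mmap(fd, 0, flags=mmap.MAP_SHARED,
                           prot=mmap.PROT_READ | mmap.PROT_WRITE)
        # Writes through this mapping are copy-on-write: they stay in
        # this process's memory and never reach the file.
        private = mmap.mmap(fd, 0, flags=mmap.MAP_PRIVATE,
                            prot=mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)  # the mmap objects keep their own duplicate descriptor
    return shared, private
```

One caveat: the mmap() documentation leaves it unspecified whether changes made to the file after the call become visible in pages of the private mapping that have not been written to, so the private mapping should not be treated as a frozen snapshot of the file.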

This would avoid the overhead of LVM, especially its performance cost.
