LVM is an amazingly useful tool; however, it still seems to lack support for snapshotting a volume that is already a snapshot. I wrote a script to handle this automatically but ran into some trouble.
My test environment
I'm testing the script on a clean Xen box (from now on: testbox). After creating the box I created a new LVM volume and added it to the testbox's drives. In testbox itself it is displayed as a normal block device, so I don't think the Dom0's LVM should interfere with the testing process.
The original drive
On testbox, I created a new partition using the following commands:
# Using the data from the other tables I determined
# at which sector my new device could begin
dmsetup table
# Create the new device without a table
dmsetup create base --notable
# Put the table into the device...
echo '0 4194304 linear 202:2 0' | dmsetup load base
dmsetup resume base
mkfs.ext2 /dev/mapper/base
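A quick way to sanity-check the table above: 4194304 sectors at 512 bytes each is exactly 2 GiB, which `blockdev` should confirm on the new device. A minimal sketch (the arithmetic here is my illustration, not from the original post):

```shell
# 4194304 sectors of 512 bytes each is exactly 2 GiB, so the new
# device should report that size.
sectors=4194304
bytes=$(( sectors * 512 ))
echo "$bytes"    # 2147483648 bytes, i.e. 2 GiB
# On the live system this should agree with:
#   blockdev --getsz /dev/mapper/base   # prints the size in 512-byte sectors
```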
To be clear, the target argument '202:2' is the second device I added to the testbox machine; I double-checked it like so:
ls /dev -l | grep 'xvda2'
Returning:
brw-rw---- 1 root disk 202, 2 May 3 17:01 xvda2
The script
I wrote this function to make a snapshot:
function create_dm_snapshot {
    banner "0: Checking that block devices don't already exist; the original device should exist...";
    device_exists $base_path$original;
    [ $? -eq 0 ] || error 'The source (original) device should exist';
    device_exists $base_path$snapshot_origin $base_path$snapshot $base_path$cow;
    [ $? -eq 0 ] && error "They already exist; please use the 'remove' function";
    echo "Done checking.";
    banner "1: Suspending the original device.";
    suspend_dev $original || error "Failed suspending original device";
    banner "2: Creating snapshot-origin.";
    create_dev $snapshot_origin || error "Failed creating snapshot-origin";
    banner "3: Read original table into snapshot-origin.";
    dmsetup table $original | dmsetup load $snapshot_origin ||
        error 'Failed loading original table into snapshot-origin';
    echo "Done reading.";
    banner "4: Resume snapshot-origin.";
    resume_dev $snapshot_origin || error 'Could not resume snapshot-origin';
    banner "5: Create snapshot device.";
    create_dev $snapshot || error 'Failed to create snapshot device';
    banner "6: Create COW device.";
    #TODO: check total sector count of device
    create_dev $cow;
    target_device=$( dmsetup table $original | awk '{print $4}' );
    last_table=$( dmsetup table | grep "$target_device" | awk '{print $6}' | sort -g | tail -n 1 );
    begin_sector_args=( $( dmsetup table | grep -E $target_device".*"$last_table"|"$last_table".*"$target_device | awk '{print $2 " " $3 " " $6}' ) );
    begin_sector=$( expr ${begin_sector_args[1]} - ${begin_sector_args[0]} + ${begin_sector_args[2]} );
    table="0 $size linear $target_device $begin_sector";
    echo $table | dmsetup load $cow;
    resume_dev $cow;
    banner "7: Calculate sector count of snapshot-origin.";
    snapshot_origin_size=$( blockdev --getsz $base_path$snapshot_origin ) ||
        error 'Could not determine sector count';
    echo "Snapshot size: $snapshot_origin_size";
    banner "8: Load snapshot table.";
    table="0 $snapshot_origin_size snapshot $base_path$snapshot_origin $base_path$cow p 64";
    [ $verbose ] && echo "Table: $table";
    echo $table | dmsetup load $snapshot || error 'Failed loading snapshot table';
    echo "Done loading.";
    banner "9: Reload original device table.";
    table="0 $snapshot_origin_size snapshot-origin $base_path$snapshot_origin";
    [ $verbose ] && echo "Table: $table";
    echo $table | dmsetup load $original || error 'Failed reloading original table';
    echo "Done reloading.";
    banner "10: Resume frozen tables.";
    resume_dev $snapshot $original || error 'Could not resume devices';
    echo "Done resuming.";
}
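The begin-sector arithmetic in step 6 can be checked in isolation. Here is a worked example run against a captured table line (sample data, not live `dmsetup` output), using the same `awk`/`expr` pipeline as the script:

```shell
# Sample line as printed by `dmsetup table` (name prefix included):
line='base: 0 4194304 linear 202:2 0'
# Fields $2, $3, $6 are the logical start, length, and backing-device offset.
begin_sector_args=( $( echo "$line" | awk '{print $2 " " $3 " " $6}' ) )
begin_sector=$( expr ${begin_sector_args[1]} - ${begin_sector_args[0]} + ${begin_sector_args[2]} )
echo "$begin_sector"    # 4194304: the first sector on 202:2 past the mapped segment
```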
The error
At step 8 (banner "8: …") the script fails with the following error:
device-mapper: reload ioctl failed: No such device or address
Command failed
dmsetup table
Results in the following table data:
dm.base.snapshot_origin: 0 4194304 linear 202:2 0
base: 0 4194304 linear 202:2 0
dm.base.snapshot:
dm.base.cow: 0 4096 linear 202:2 4194304
As I wasn't able to determine the cause of the error, the last thing I did was look into my dmesg:
dmesg | tail
Giving me:
PM: freeze of devices complete after 0.080 msecs
suspending xenstore…
PM: late freeze of devices complete after 0.019 msecs
PM: early restore of devices complete after 0.035 msecs
PM: restore of devices complete after 32.367 msecs
Setting capacity to 10485760
Setting capacity to 104857600
device-mapper: persistent snapshot: Invalid or corrupt snapshot
device-mapper: table: 254:2: snapshot: Failed to read snapshot metadata
device-mapper: ioctl: error adding target to table
I wasn't able to find out what caused the snapshot to be corrupt.
Best Answer
That does not guarantee that the exported disk(s) contain only zeroes in their unwritten areas. Thus the kernel may detect something which isn't really there. You should overwrite the first part of the COW volume (I don't know how much is needed, but the first 4 MiB should be enough). Oh, your COW volume isn't even 4 MiB in size:
Maybe there is a minimum size for COW volumes and yours is simply too small?
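A minimal sketch of that suggestion, assuming the device names from the question. Since the real command has to run against the live COW device, the zeroing is demonstrated here on a temp file standing in for it:

```shell
# Zero the start of the COW device so the persistent-snapshot code does
# not mistake leftover on-disk data for a (corrupt) existing snapshot
# header. A temp file stands in for /dev/mapper/dm.base.cow here:
cow_stand_in=$(mktemp)
dd if=/dev/zero of="$cow_stand_in" bs=1M count=2 2>/dev/null
size_bytes=$(wc -c < "$cow_stand_in")
echo "$size_bytes"    # 2097152, i.e. 2 MiB of zeroes written
rm -f "$cow_stand_in"
# On the real system the target would be the COW device itself:
#   dd if=/dev/zero of=/dev/mapper/dm.base.cow bs=1M count=2
```

If the minimum-size theory is right, it would also be worth recreating `dm.base.cow` with a larger table (its current 4096 sectors is only 2 MiB) before reloading the snapshot table.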