I have a Debian Lenny server with a SAN-attached LUN configured as the only PV of a VG named 'datavg'.
Yesterday I updated the box with Debian patches and rebooted it.
After the reboot it failed to come up, complaining that it could not find /dev/mapper/datavg-datalv.
This is what I did:
– booted into rescue mode and commented out the mount in /etc/fstab
– rebooted into normal multi-user mode (the mount point is /data; only PostgreSQL failed to start)
– ran vgdisplay, lvdisplay and pvdisplay to find out what had happened to the volume group (datavg was missing entirely)
After that, I noticed that the LUN is still visible to Linux, and so is the LVM partition on it:
# ls -la /dev/mapper/mpath0*
brw-rw---- 1 root disk 254, 6 2009-11-23 15:48 /dev/mapper/mpath0
brw-rw---- 1 root disk 254, 7 2009-11-23 15:48 /dev/mapper/mpath0-part1
– Then I ran pvscan to see whether it could find the PV. Unfortunately, it did not detect the partition as a PV.
– I ran pvck on the partition, but it did not find any label:
# pvck /dev/mapper/mpath0-part1
Could not find LVM label on /dev/mapper/mpath0-part1
– Then I wondered whether the LUN was perhaps empty, so I dumped the first few MB with dd. In that dump I could see the LVM metadata:
datavg {
    id = "removed-hwEK-Pt9k-Kw4F7e"
    seqno = 2
    status = ["RESIZEABLE", "READ", "WRITE"]
    extent_size = 8192
    max_lv = 0
    max_pv = 0

    physical_volumes {
        pv0 {
            id = "removed-AfF1-2hHn-TslAdx"
            device = "/dev/dm-7"
            status = ["ALLOCATABLE"]
            dev_size = 209712382
            pe_start = 384
            pe_count = 25599
        }
    }

    logical_volumes {
        datalv {
            id = "removed-yUMd-RIHG-KWMP63"
            status = ["READ", "WRITE", "VISIBLE"]
            segment_count = 1

            segment1 {
                start_extent = 0
                extent_count = 5120
                type = "striped"
                stripe_count = 1    # linear
                stripes = [
                    "pv0", 0
                ]
            }
        }
    }
}
Note that this came from the very partition on which pvck could not find an LVM label!
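For reference, this is roughly how such a dump can be taken. The sketch below only demonstrates the dd/strings pattern on a scratch file, so it touches no devices; the path, offset and fake metadata are illustrative, not the real ones. LVM2 keeps a plain-text copy of the VG metadata near the start of the PV, which is why strings can recover it from the raw bytes:

```shell
# Scratch-file demo; against the real LUN the equivalent would be:
#   dd if=/dev/mapper/mpath0-part1 bs=1M count=2 | strings | less
img=$(mktemp)
# Plant some fake VG metadata a few sectors in, as LVM2 does on a real PV:
printf 'datavg {\nid = "fake-uuid"\nseqno = 2\n}\n' |
    dd of="$img" bs=512 seek=8 conv=notrunc 2>/dev/null
# Pull printable text back out of the raw image:
found=$(dd if="$img" bs=1M count=2 2>/dev/null | strings)
echo "$found"
rm -f "$img"
```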
– I decided to write a new LVM label to the partition and restore its parameters from the backup file:
pvcreate --uuid removed-AfF1-2hHn-TslAdx --restorefile /etc/lvm/backup/datavg /dev/mapper/mpath0-part1
– Then I ran vgcfgrestore -f /etc/lvm/backup/datavg datavg
– After that, the PV appears when I issue a pvscan.
– With vgchange -ay datavg I activated the VG, and the LV became available.
– When I tried to mount the LV, however, no filesystem was found. I attempted recovery in several ways, but did not succeed.
– After making a dd copy of the affected LV, I tried to recreate the superblocks with
mkfs.ext3 -S /dev/datavg/backupdatalv
– but the result of this cannot be mounted:
# mount /dev/datavg/backupdatalv /mnt/
mount: Stale NFS file handle
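One avenue I have not exhausted: ext3 keeps backup copies of the superblock, and e2fsck can be pointed at one of them instead of the (possibly damaged) primary. The sketch below demonstrates the pattern on a throwaway image file only; on the real system the target would be the dd copy of the LV, and the backup block number 8193 is just the mke2fs default for a small filesystem with 1 KiB blocks, not a value from my LV:

```shell
# Throwaway demo image -- no real LV is touched.
img=$(mktemp)
truncate -s 16M "$img"
mkfs.ext3 -q -F "$img"
# -n makes mke2fs only *print* its layout, including where the backup
# superblocks live; it never writes anything:
backups=$(mke2fs -n -F "$img" 2>/dev/null | grep -A1 -i 'superblock backups')
echo "$backups"
# Read-only check (-n) using a backup superblock instead of the primary;
# on this healthy image it is a no-op, on a damaged volume it may recover:
e2fsck -fn -b 8193 "$img" || true
rm -f "$img"
```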
The fact that this can happen at all is unsettling, to say the least, so I want to find out everything I can about this failure.
My questions:
– How can the LVM label disappear after patches and a reboot?
– Why is the filesystem gone after salvaging the PV? (Did the pvcreate command trash the data?)
– Is the ext3 filesystem in the LV still salvageable?
– Is there anything I could have done to prevent this issue?
Thanks in advance,
Ger.
Best Answer
I once ran into a similar problem. In our case, someone had created a partition to hold the PV, but when they ran pvcreate they forgot to specify the partition and used the whole device instead. The system ran fine until a reboot, after which LVM could no longer find the PV.
So in your case, is it possible that someone ran "pvcreate /dev/mapper/mpath0" at the time of creation rather than "pvcreate /dev/mapper/mpath0-part1"? If so, you'll need to remove the partition table from the disk containing the PV.
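You can test that hypothesis directly: LVM2's label is the ASCII string LABELONE, normally written into the second 512-byte sector of whatever device pvcreate was pointed at. If the PV was created on the whole device, the label shows up on mpath0 rather than mpath0-part1. The sketch below demonstrates the check on scratch files (on your box the two candidates would be /dev/mapper/mpath0 and /dev/mapper/mpath0-part1):

```shell
whole=$(mktemp); part=$(mktemp)
# Plant an LVM2 label in sector 1 of the "whole disk" only, as a botched
# whole-device pvcreate would have done:
printf 'LABELONE' | dd of="$whole" bs=512 seek=1 conv=notrunc 2>/dev/null
truncate -s 4096 "$part"   # "partition" image with no label at all
# Read sector 1 of each candidate and count LABELONE occurrences:
whole_hit=$(dd if="$whole" bs=512 skip=1 count=1 2>/dev/null | grep -c LABELONE || true)
part_hit=$(dd if="$part" bs=512 skip=1 count=1 2>/dev/null | grep -c LABELONE || true)
echo "whole=$whole_hit part=$part_hit"
rm -f "$whole" "$part"
```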
As the pvcreate(8) man page notes, LVM will not recognize a whole-device PV while a partition table is still present on that device. Once we removed the partition table, the PV was recognized and we could access our data again.
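The wipe itself is just one sector of zeros (the pvcreate(8) man page suggests exactly this dd invocation). It is destructive, so save the sector first. The sketch below performs the same two steps on a scratch file; on your box the target would be /dev/mapper/mpath0:

```shell
disk=$(mktemp)
# Stand-in for the MBR/partition table living in sector 0:
printf 'fake partition table' | dd of="$disk" conv=notrunc 2>/dev/null
# 1. Keep a backup of the sector before destroying it:
dd if="$disk" of="$disk.bak" bs=512 count=1 2>/dev/null
# 2. Zero sector 0, which removes the partition table:
dd if=/dev/zero of="$disk" bs=512 count=1 conv=notrunc 2>/dev/null
leftover=$(tr -d '\0' < "$disk")        # anything non-zero left in the image?
baktext=$(tr -d '\0' < "$disk.bak")     # the saved sector, for a rollback
echo "leftover='$leftover'"
rm -f "$disk" "$disk.bak"
```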