Linux – Volume group disappeared, LVs still available

Tags: disk-volume, kvm-virtualization, linux, lvm

I've run into an issue with my KVM host, which runs its VMs on an LVM volume group. As of last night the logical volumes are no longer recognised as such: I can't create snapshots of them, even though I have been doing so for months.

Every scan comes back with nothing found:

[root@apollo ~]# pvscan
No matching physical volumes found

[root@apollo ~]# vgscan
Reading all physical volumes.  This may take a while...
No volume groups found

[root@apollo ~]# lvscan
No volume groups found

If I try restoring the VG config backup from /etc/lvm/backup/vg0, I get the following error:

[root@apollo ~]# vgcfgrestore -f /etc/lvm/backup/vg0 vg0
Couldn't find device with uuid 20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt.
Cannot restore Volume Group vg0 with 1 PVs marked as missing.
Restore failed.

/etc/lvm/backup/vg0 has the following for the physical volume:

physical_volumes {

            pv0 {
                    id = "20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt"
                    device = "/dev/sda5"    # Hint only

                    status = ["ALLOCATABLE"]
                    flags = []
                    dev_size = 4292870143   # 1.99902 Terabytes
                    pe_start = 384
                    pe_count = 524031       # 1.99902 Terabytes
            }
}
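The UUID that vgcfgrestore complains about can be compared with what the backup file records. The following is a minimal sketch, not part of the original post: `backup_pv_ids` is a hypothetical helper name, and it assumes the backup uses the usual `physical_volumes { ... }` layout shown above.

```shell
# Print every PV UUID recorded in the physical_volumes section of an
# LVM metadata backup file ($1). Helper name is hypothetical.
backup_pv_ids() {
    awk '
        # Start tracking when the physical_volumes block opens.
        /physical_volumes[[:space:]]*\{/ { in_pv = 1; depth = 0 }
        in_pv {
            # Lines look like: id = "20zG25-H8MU-..."
            if (match($0, /id = "[^"]+"/)) {
                print substr($0, RSTART + 6, RLENGTH - 7)
            }
            # Follow brace nesting so we stop at the end of the block.
            depth += gsub(/\{/, "{") - gsub(/\}/, "}")
            if (depth <= 0) in_pv = 0
        }
    ' "$1"
}
```

On the affected host, `backup_pv_ids /etc/lvm/backup/vg0` could then be compared by eye with `blkid -s UUID -o value /dev/sda5`, which reports the PV UUID when the LVM label on that partition is still readable.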

fdisk -l /dev/sda shows the following:

[root@apollo ~]# fdisk -l /dev/sda

Disk /dev/sda: 6000.1 GB, 6000069312512 bytes
64 heads, 32 sectors/track, 5722112 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000188b7

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               2       32768    33553408   82  Linux swap / Solaris
/dev/sda2           32769       33280      524288   83  Linux
/dev/sda3           33281     1081856  1073741824   83  Linux
/dev/sda4         1081857     3177984  2146435072   85  Linux extended
/dev/sda5         1081857     3177984  2146435071+  8e  Linux LVM
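One quick sanity check is whether /dev/sda5 still has the size the backup expects. The sketch below is an illustration only: `blocks_to_sectors` is a hypothetical helper, and it assumes the 512-byte sectors reported by fdisk above and that a trailing "+" in the Blocks column marks an odd sector count.

```shell
# Convert the "Blocks" column of old-style `fdisk -l` output (1 KiB units)
# into 512-byte sectors. A trailing "+" means the partition has an odd
# number of sectors, so the block figure was rounded down by half a block.
blocks_to_sectors() {
    blocks=${1%+}                 # strip the optional "+"
    sectors=$((blocks * 2))
    case $1 in
        *+) sectors=$((sectors + 1)) ;;
    esac
    echo "$sectors"
}

# Values copied from the question: sda5 in fdisk, dev_size in the backup.
fdisk_blocks='2146435071+'
backup_dev_size=4292870143

if [ "$(blocks_to_sectors "$fdisk_blocks")" -eq "$backup_dev_size" ]; then
    echo "partition size matches backup dev_size"
else
    echo "partition size differs from backup dev_size"
fi
```

With the figures above the two values agree (2146435071 × 2 + 1 = 4292870143), which suggests the partition table itself has not changed since the backup was written.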

The server is running a 4-disk hardware RAID 10, which looks perfectly healthy according to megacli and smartd.

The only odd message in /var/log/messages is the following which shows up every couple of hours:

Jun 10 09:41:57 apollo udevd[527]: failed to create queue file: No space left on device

Output of df -h

[root@apollo ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3            1016G  119G  847G  13% /
/dev/sda2             508M   67M  416M  14% /boot

Does anyone have any ideas what to do next? The VMs are all running fine at the moment apart from not being able to snapshot them.

Updated with extra info
It's not a lack of inodes:

[root@apollo ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda3            67108864   48066 67060798    1% /
/dev/sda2              32768      47   32721    1% /boot

pvs, vgs & lvs either output nothing or "No volume groups found".

Best Answer

I think udev has somehow stopped working, so you no longer have access to the low-level commands.

You can try:

pvs
vgs
lvs

commands to check your running lvm configuration.

You can try restarting udev (or rebooting the server as a last resort).

Just out of curiosity, what does df -i say?
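Since the udevd message is "No space left on device" and the df output in the question does not list /dev, it may be worth checking the filesystem behind /dev before restarting anything. This is a sketch under that assumption (on udev versions of that era the queue file is, as far as I know, kept under /dev/.udev); both helper names are hypothetical.

```shell
# Print the use percentage (digits only) of the filesystem holding $1.
fs_use_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# List regular files larger than 1 MiB under $1 without crossing into
# other filesystems. /dev normally holds device nodes, sockets and
# symlinks; a large regular file there is a likely culprit when the
# filesystem behind /dev is full.
large_regular_files() {
    # -size counts 512-byte blocks, so +2048 means "more than 1 MiB".
    find "$1" -xdev -type f -size +2048 2>/dev/null
}

# Example, run as root on the affected host:
#   fs_use_percent /dev
#   large_regular_files /dev
```

If /dev turns out to be at 100%, removing the stray file should let udevd create its queue file again. How udev itself is restarted depends on the distribution, so that step is left out here.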

Related Solutions

Lvm – How to resize a regular (non-LVM) partition

In theory, you could reduce the size of sda1, increase the size of the extended partition, shift the contents of the extended partition down, then increase the size of the PV on the extended partition and you'd have the extra room. However, the number of possible things that can go wrong there is just astronomical, so I'd recommend either buying a second hard drive (and possibly transferring everything onto it in a more sensible layout, then repartitioning your current drive better) or just making some bind mounts of various bits and pieces out of /home into / to free up a bit more space.

Linux – Rescue disk is unable to see the lvm physical volumes

vgscan -vvvv

will give you very detailed output about why vgscan does or does not treat a given volume as part of a volume group. You could also run pvs -a to see a summary of your physical volumes along with their volume group assignments.

vgscan -vvvv output for one of the partitions:

Opened /dev/sdc3 RO /dev/sdc3: size is 3772817055 sectors 
Closed /dev/sdc3 /dev/sdc3: size is 3772817055 sectors 
Opened /dev/sdc3 RO O_DIRECT /dev/sdc3: block size is 512 bytes 
Closed /dev/sdc3 Using /dev/sdc3 
Opened /dev/sdc3 RO O_DIRECT /dev/sdc3: block size is 512 bytes 
/dev/sdc3: No label detected
Closed /dev/sdc3
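The -vvvv output is long, so it can help to filter it down to the devices that LVM rejected for lack of a label. A minimal sketch, assuming the message text is exactly "No label detected" as shown above; `unlabeled_devices` is a hypothetical helper name.

```shell
# Read vgscan/pvscan debug output on stdin and print each /dev path that
# was reported with "No label detected", one per line, without duplicates.
unlabeled_devices() {
    sed -n 's|.*\(/dev/[^: ]*\): No label detected.*|\1|p' | sort -u
}

# Example, run as root (debug output goes to stderr, hence 2>&1):
#   vgscan -vvvv 2>&1 | unlabeled_devices
```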

pvs -a didn't reveal anything. All physical volumes were listed without a volume group assignment.

"No label detected" sounds pretty sad. Are you sure that it is an LVM2 partition and not, say, a partition used by md-raid? You could check using mdadm --examine /dev/sdc3. And please post the output of fdisk -l /dev/sdc.

Yes, I'm sure it is an LVM2 partition. The mdadm command gives "No md superblock detected on /dev/sdc3". fdisk says /dev/sdc3 is a Linux LVM partition.

Ah, then you will be in the lucky position (irony here, sorry) to try LVM recovery due to presumably damaged data structures. There is a howto about LVM recovery which might give you a starting point - try loading your VG configuration either from the disk itself (using dd if=/dev/sdc3 bs=512 count=255 skip=1) or from the /etc/lvm/backup folder of your former root filesystem (which I understand is on /dev/sdc1) into /etc/lvm/backup/ and re-issuing the vgscan command.

I tried that on both sda3 and sdc3 (as you can see, I have 3 LVM partitions to do this to) and they all result in binary data in the output text file. OK, correction: there is some LVM metadata in the file, but it starts several bytes in. I'm looking through the data, and it looks correct. I will keep working through that restore process.

This is expected: both the VG and LV configs are stored as cleartext within binary structures.
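Because the metadata is plain text embedded in binary data, the dump can be reduced to its readable part. The sketch below reuses the dd parameters quoted earlier (bs=512, skip=1, count=255) and simply drops every byte that is not printable ASCII, tab or newline; `extract_lvm_text` is a hypothetical helper name.

```shell
# Dump the LVM metadata area of a device or image file ($1) and keep
# only tab, newline and printable ASCII, so the cleartext VG/LV config
# can be read or saved to a file.
extract_lvm_text() {
    dd if="$1" bs=512 skip=1 count=255 2>/dev/null |
        LC_ALL=C tr -cd '\11\12\40-\176'
}

# Example, run as root (output file name is an assumption):
#   extract_lvm_text /dev/sdc3 > /tmp/sdc3-metadata.txt
```

The result may contain several generations of the configuration back to back, so it still needs to be trimmed by hand to one complete, consistent copy before it is used as a restore file.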

I ended up using a slightly different process from the one outlined here. I built a cfgbackup file from the byte data in the LVM partition, ran pvcreate, and then ran vgcfgrestore. After that, it worked. Thanks for the help.
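The pvcreate and vgcfgrestore steps are not spelled out in the thread. A common form of that sequence is sketched below as a dry run that only prints the commands; the exact invocation the poster used is not known, `print_pv_recovery_plan` is a hypothetical helper, and the UUID, device, file and VG name are all placeholders to be filled in from the recovered metadata.

```shell
# Print (do not run) a typical PV recovery sequence.
#   $1 PV UUID from the recovered metadata
#   $2 device that lost its label
#   $3 metadata file reconstructed from the dump or from /etc/lvm/backup
#   $4 volume group name
print_pv_recovery_plan() {
    uuid=$1 dev=$2 restorefile=$3 vg=$4
    # Recreate the PV label with the original UUID.
    echo "pvcreate --uuid $uuid --restorefile $restorefile $dev"
    # Write the VG metadata back from the same file.
    echo "vgcfgrestore -f $restorefile $vg"
}

# Example with placeholder values:
#   print_pv_recovery_plan "$pv_uuid" /dev/sdc3 /tmp/recovered.cfg "$vg_name"
```

Printing the commands first gives a chance to check the UUID and device against the recovered metadata, since pvcreate on the wrong device is destructive.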
