Linux – Volume group disappeared, LVs still available

disk-volume, kvm-virtualization, linux, lvm

I've run into an issue with my KVM host, which runs its VMs on an LVM volume group. As of last night the logical volumes are no longer recognized as such: I can't create snapshots of them, even though I have been doing so for months.

Every scan I run comes back empty:

[root@apollo ~]# pvscan
No matching physical volumes found

[root@apollo ~]# vgscan
Reading all physical volumes.  This may take a while...
No volume groups found

[root@apollo ~]# lvscan
No volume groups found

If I try restoring the VG config backup from /etc/lvm/backup/vg0, I get the following error:

[root@apollo ~]# vgcfgrestore -f /etc/lvm/backup/vg0 vg0
Couldn't find device with uuid 20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt.
Cannot restore Volume Group vg0 with 1 PVs marked as missing.
Restore failed.

/etc/lvm/backup/vg0 has the following for the physical volume:

physical_volumes {

            pv0 {
                    id = "20zG25-H8MU-UQPf-u0hD-NftW-ngsC-mG63dt"
                    device = "/dev/sda5"    # Hint only

                    status = ["ALLOCATABLE"]
                    flags = []
                    dev_size = 4292870143   # 1.99902 Terabytes
                    pe_start = 384
                    pe_count = 524031       # 1.99902 Terabytes
            }
}
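One way to cross-check the backup against what is actually on disk is to pull the PV UUID and device hint out of the backup file and compare them with what the tools report for /dev/sda5. The following is a minimal sketch, assuming the backup layout shown above (`id` listed before `device` inside each PV entry); the helper name `pv_hints` is made up for illustration and is not part of LVM:

```shell
# pv_hints FILE: print "uuid device" for each PV entry in an LVM metadata backup.
# Hypothetical helper; it only reads the text file and never touches the disk.
pv_hints() {
    awk '
        # Remember the most recent id = "..." line (quotes stripped).
        /^[[:space:]]*id[[:space:]]*=/     { gsub(/"/, "", $3); id = $3 }
        # A device = "..." line only occurs in PV entries; pair it with that id.
        /^[[:space:]]*device[[:space:]]*=/ { gsub(/"/, "", $3); print id, $3 }
    ' "$1"
}
```

Running `pv_hints /etc/lvm/backup/vg0` should print the UUID next to the device hint, which can then be compared by eye with the output of `blkid /dev/sda5`, assuming blkid still recognizes the partition as an LVM member.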

fdisk -l /dev/sda shows the following:

[root@apollo ~]# fdisk -l /dev/sda

Disk /dev/sda: 6000.1 GB, 6000069312512 bytes
64 heads, 32 sectors/track, 5722112 cylinders
Units = cylinders of 2048 * 512 = 1048576 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000188b7

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               2       32768    33553408   82  Linux swap / Solaris
/dev/sda2           32769       33280      524288   83  Linux
/dev/sda3           33281     1081856  1073741824   83  Linux
/dev/sda4         1081857     3177984  2146435072   85  Linux extended
/dev/sda5         1081857     3177984  2146435071+  8e  Linux LVM
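As a sanity check, the size of /dev/sda5 in this listing can be compared with the dev_size recorded in the vg0 backup. This is a small sketch of the arithmetic, assuming fdisk's "Blocks" column is in 1 KiB units, the trailing "+" marks an odd number of 512-byte sectors, and dev_size is counted in 512-byte sectors:

```shell
# fdisk reports sizes in 1 KiB blocks; a trailing "+" means one extra
# 512-byte sector beyond blocks * 2.
blocks=2146435071          # /dev/sda5 "Blocks" column, without the "+"
sectors=$(( blocks * 2 + 1 ))
dev_size=4292870143        # dev_size from the vg0 backup, in sectors

if [ "$sectors" -eq "$dev_size" ]; then
    echo "partition size matches backup dev_size: $sectors sectors"
else
    echo "mismatch: partition=$sectors backup=$dev_size"
fi
```

Under those assumptions the two figures agree, which suggests the partition table still describes the same partition the PV was created on.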

The server is running a 4-disk hardware RAID 10, which seems perfectly healthy according to megacli and smartd.

The only odd message in /var/log/messages is the following, which shows up every couple of hours:

Jun 10 09:41:57 apollo udevd[527]: failed to create queue file: No space left on device

Output of df -h

[root@apollo ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3            1016G  119G  847G  13% /
/dev/sda2             508M   67M  416M  14% /boot

Does anyone have any ideas what to do next? The VMs are all running fine at the moment apart from not being able to snapshot them.

Updated with extra info
It's not a lack of inodes:

[root@apollo ~]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/sda3            67108864   48066 67060798    1% /
/dev/sda2              32768      47   32721    1% /boot

pvs, vgs & lvs either output nothing or "No volume groups found".

Best Answer

I think udev has somehow stopped working, so the low-level LVM commands no longer work for you.

To check your running LVM configuration, try:

pvs
vgs
lvs

You can try restarting udev (or rebooting the server as a last resort).
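Before restarting anything, it may be worth checking whether the filesystem backing /dev is full. This is a rough sketch, assuming the udevd "No space left on device" message refers to /dev rather than to /; the helper name `use_pct` is made up for illustration:

```shell
# use_pct PATH: print the Use% of the filesystem holding PATH (POSIX df output).
use_pct() {
    df -P "$1" | awk 'NR == 2 { print $5 }'
}

echo "/dev space usage: $(use_pct /dev)"

# Uncomment to replay device events once /dev has room again (needs root):
# udevadm trigger
# udevadm settle
```

If /dev turns out to be at 100%, freeing space there first is probably a safer step than a reboot while the VMs are still running.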

Just out of curiosity, what does df -i say?
