The duplicate UUIDs are normal.
Here's what mine look like:
$ stat /dev/disk/by-uuid/* | grep md
File: `/dev/disk/by-uuid/4047dc03-xxxx-xxxx-xxxx-xxxxxxxxxxxx' -> `../../md1'
File: `/dev/disk/by-uuid/78aeced1-xxxx-xxxx-xxxx-xxxxxxxxxxxx' -> `../../md0'
File: `/dev/disk/by-uuid/aec72c9f-xxxx-xxxx-xxxx-xxxxxxxxxxxx' -> `../../md2'
mdadm gives:
$ sudo mdadm -D /dev/md{0,1,2} | grep UUID
UUID : cb706582:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : 4033316c:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : e7ae2c88:xxxxxxxx:xxxxxxxx:xxxxxxxx
mdadm and vol_id give the same UUIDs for the partitions (vol_id output omitted for brevity, but I haven't tried pulling the disks and checking them while they're not in an array):
$ sudo mdadm -E /dev/sd{a,b,c,d}{1,2} 2> /dev/null | grep UUID
UUID : cb706582:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : 4033316c:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : e7ae2c88:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : 4033316c:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : e7ae2c88:xxxxxxxx:xxxxxxxx:xxxxxxxx
UUID : cb706582:xxxxxxxx:xxxxxxxx:xxxxxxxx
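Grepping for UUID hides which device each line came from. A small helper (the function name is mine, not a standard tool) pulls just the value out of a UUID line so you can pair members with arrays in a loop:

```shell
# Hypothetical helper: extract the value from a "UUID : ..." line in
# mdadm output (splits on " : " and strips any stray spaces).
extract_uuid() {
  awk -F' : ' '/UUID/ { gsub(/ /, "", $2); print $2; exit }'
}

# e.g. print each member partition next to its array's UUID:
#   for d in /dev/sd{a,b,c,d}{1,2}; do
#     echo "$d $(sudo mdadm -E "$d" 2>/dev/null | extract_uuid)"
#   done
```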
blkid gives me a different set of UUIDs, but they are still duplicated across mirrored partitions:
$ sudo blkid /dev/sd{a,b,c,d}{1,2} 2> /dev/null
/dev/sda1: UUID="826570cb-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
/dev/sdb1: UUID="6c313340-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
/dev/sdb2: UUID="882caee7-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
/dev/sdc1: UUID="6c313340-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
/dev/sdc2: UUID="882caee7-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
/dev/sdd1: UUID="826570cb-xxxx-xxxx-xxxx-xxxxxxxxxxxx" TYPE="mdraid"
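The blkid values above don't appear to be unrelated to the mdadm ones: each 8-hex-digit word looks like the same value with its byte order reversed (e.g. cb706582 vs 826570cb), which is how blkid renders the little-endian on-disk md superblock UUID. A quick sketch to check this (the helper name is mine, not a standard tool):

```shell
# Hypothetical helper: reverse the byte order of one 8-hex-digit UUID word,
# turning the mdadm form into the form blkid prints (or vice versa).
swap_word() {
  local w=$1
  printf '%s%s%s%s\n' "${w:6:2}" "${w:4:2}" "${w:2:2}" "${w:0:2}"
}

swap_word cb706582   # -> 826570cb, the value blkid shows for sda1/sdd1
```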
Assuming you've tested the disks with hdparm, smartctl, or some other tool, feel free to blame the next cheapest component to replace (assuming the disks were the cheapest).
So this was a combination of a bad stick of RAM and a Linux kernel bug affecting SATA. I'd put Ubuntu 10.04 on there, and eventually left memtest86+ running all night (running it for only 1.5 passes earlier hadn't flushed out the problem).
After I removed the bad RAM, I started seeing SATA errors in /var/log/syslog, similar to this:
Dec 8 14:56:17 george kernel: [ 36.442340] ata4.00: exception Emask 0x10 SAct 0x4 SErr 0x4010000 action 0xe frozen
Dec 8 14:56:17 george kernel: [ 36.442355] ata4.00: irq_stat 0x00400040, connection status changed
Dec 8 14:56:17 george kernel: [ 36.442366] ata4: SError: { PHYRdyChg DevExch }
Dec 8 14:56:17 george kernel: [ 36.442375] ata4.00: failed command: READ FPDMA QUEUED
Dec 8 14:56:17 george kernel: [ 36.442388] ata4.00: cmd 60/08:10:88:a9:87/00:00:1b:00:00/40 tag 2 ncq 4096 in
Dec 8 14:56:17 george kernel: [ 36.442389] res 40/00:64:30:aa:8b/00:00:12:00:00/40 Emask 0x10 (ATA bus error)
Dec 8 14:56:17 george kernel: [ 36.442408] ata4.00: status: { DRDY }
Dec 8 14:56:17 george kernel: [ 36.442418] ata4: hard resetting link
Dec 8 14:56:23 george kernel: [ 41.724689] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 8 14:56:24 george kernel: [ 42.445422] ata4.00: configured for UDMA/133
Dec 8 14:56:24 george kernel: [ 42.445432] ata4: EH complete
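If you want to see how often a given port is flapping like this, you can tally the resets per port. This is a sketch; the "hard resetting link" message format matches the log above, but the log path on your distro is an assumption:

```shell
# Count "hard resetting link" events per ATA port from kernel log text on
# stdin; a port that dominates the tally points at that cable/controller.
count_ata_resets() {
  grep -oE 'ata[0-9]+(\.[0-9]+)?: hard resetting link' | sort | uniq -c | sort -rn
}

# e.g.  count_ata_resets < /var/log/syslog
```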
I finally discovered this bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/285892?comments=all , which led me to try an earlier Linux kernel (the one that ships with Ubuntu 8.04). The machine's been working great ever since.
Best Answer
You won't be able to find this information from within the VM. Are you trying to do this programmatically? If you can deal with a couple of manual steps, you can look at the virtual machine's logs in vSphere under
Tasks and Events
to see the VM's history. You may also be able to look at the source templates' histories and glean the same information.