Any way to lock/unlock a Hosted Engine in oVirt

glusterfs kvm-virtualization libvirt ovirt

I have a weird situation with my Hosted Engine in oVirt.

We have an oVirt cluster set up using Gluster as the storage for the engine's ISOs and all the information. About a week ago two of the three servers went down. We restarted the machines about three times, and the Gluster hosts reported as connected and did not report any split-brain errors.

The hosted-engine process tried to bring up the hosted engine on one of the hosts, but it went into an EngineUnexpectedlyDown state. It would subtract 1600 from that server's score and then try to bring the engine up on the next machine, until they all ended up with a score of about 800. Now it just tries to boot on one machine and then sits there with a "failed to reach vm" message.

We've figured out that the hosted engine is actually booting, as we can connect to it with a VNC client. But it seems to be in some sort of locked state. If you log into the hosted engine, even as root, no files can be changed, and the VM is inaccessible through any other means.

Is there any way to see if the VM is locked/read-only?
And is there any way to remove said lock?

Best Answer

Assuming you were using replica 3, when 2 hosts go down the file system becomes read-only, which could explain some of what you currently see. In most cases we would expect the VM to freeze, since qemu cannot write to the storage, but I need more information about it, so log files from the hosts are needed here.
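While collecting the logs, it may also help to confirm how many peers each host currently sees. The following is only a minimal sketch: it assumes the usual `gluster peer status` output, where each peer has a `State: ... (Connected)` or `(Disconnected)` line, and the helper name `count_connected_peers` and the volume name `engine` are placeholders of mine, not taken from your setup.

```shell
# Count the peers reported as connected, reading `gluster peer status`
# output on stdin. Prints 0 when no peer is connected.
count_connected_peers() {
    grep -c 'State:.*(Connected)' || true
}

# Typical use on one of the hosts (not executed here):
#   gluster peer status | count_connected_peers
#
# Pending heals or split-brain entries can be listed per volume;
# "engine" is an assumed volume name, substitute your own:
#   gluster volume heal engine info
```

On a three-host cluster a healthy host should report 2 connected peers. A lower number on any host suggests the storage may still be degraded even though the machines are up.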

First of all, let's check that the status can be read from the storage. You can do that by running the following from one of the hosts:

hosted-engine --vm-status

Assuming that works, try to move the host to global maintenance:

hosted-engine --set-maintenance --mode=global

If this does not work, it means there are issues with accessing the metadata file in the storage, and potentially that the storage is read-only.

If that works, the VM will be in maintenance mode, which lets you check the state of the files inside the VM and, if needed, reboot the VM on the same host.
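To answer the "is it read-only?" part from inside the VM, one option is to look at the mount options the kernel reports. This is a rough sketch, assuming a Linux guest where `/proc/mounts` is available; `list_ro_mounts` is just a helper name I made up for illustration.

```shell
# Print the mount point of every filesystem mounted read-only.
# Expects /proc/mounts-style lines on stdin:
#   device mountpoint fstype options dump pass
list_ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }'
}

# Inside the engine VM (not executed here):
#   list_ro_mounts < /proc/mounts
#
# Once the VM is healthy again, global maintenance can be left from a host:
#   hosted-engine --set-maintenance --mode=none
```

If `/` shows up in the output, the guest has most likely remounted its root filesystem read-only after I/O errors. Pseudo-filesystems such as the cgroup tmpfs can legitimately be listed as read-only too, so focus on the disk-backed mounts.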