I just want to verify that this error is the result of normal behavior when a host goes down in a cluster. We are migrating to vCenter 5.1 (from version 4.1) and are testing three ESXi 4.1 servers in a cluster in vCenter 5.1. Part of this was to test HA.
I connected to one of the ESXi hosts via iLO and restarted the host directly from the console. When the server went down for the reboot, the VMs on that server failed over to the other ESXi hosts in the cluster as expected. A few minutes later, while that host was still down and rebooting, we received an alert on the first node in the cluster (which shows an HA state of "Connected (Slave)") stating:
"The vSphere HA agent on this host cannot reach some of the management network addresses of other hosts, and HA may not be able to restart VMs if a host failure occurs: [servername / IP]" (the server name and IP being those of the server that we rebooted from the ESXi console).
The second host in our cluster shows the HA status as "Connected (Master)" and this alert does not show for this server.
Once the third node came back up from the reboot, the error went away. Is this message normal when a server in a cluster goes down in vCenter 5.1?
Thanks