vSphere multi-pathing failover
Whilst researching some storage options for a client, I stumbled across this pretty useful information:
1) How often does ESX check for path failure?
==> As soon as an I/O request to a path fails, ESX will initiate a path failover. If there is no I/O outstanding to a path, ESX will by default probe each physical path every 5 minutes to proactively detect path failure.
2) How long will ESX wait before trying a different storage path?
==> A different storage path is tried immediately.
3) What happens between the failure detection and the connection recovery?
==> I/O requests will be queued.
4) Under which circumstances will a host-initiated target reset or LUN reset occur?
==> Resets are not typically initiated by ESX. Two exceptions to this rule are:
- if a path failure occurs while there is a SCSI-2 reservation outstanding on the failed path
- if the VM or userworld that initiated the I/O request sends a request to abort the outstanding I/O request.
5) Will ESX ever force a LUN trespass in the array?
==> Only in the case of an active/passive (A/P) array when there is no working path on the same storage processor (SP) as the failing path. An EMC Clariion, for example, does this with the “trespass” command.
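The path selection and trespass rule from answers 1, 2 and 5 can be sketched in a few lines of Python. This is only an illustrative model, not how ESX is implemented: the `Path` class, the `choose_failover_path` function and the path names are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Path:
    """Hypothetical model of one physical path to a LUN."""
    name: str           # illustrative label, e.g. "vmhba1:0:0"
    sp: str             # storage processor the path terminates on
    alive: bool = True


def choose_failover_path(
    paths: List[Path], failed: Path, active_passive: bool
) -> Tuple[Optional[Path], bool]:
    """Return (new_path, trespass_needed) after `failed` has gone down.

    A trespass is only needed on an active/passive array when no working
    path remains on the same SP as the failed path.
    """
    candidates = [p for p in paths if p.alive and p is not failed]
    same_sp = [p for p in candidates if p.sp == failed.sp]
    if same_sp:
        return same_sp[0], False          # stay on the same SP, no trespass
    if candidates:
        return candidates[0], active_passive
    return None, False                    # no working path: I/O stays queued


paths = [
    Path("vmhba1:0:0", "SPA"),
    Path("vmhba2:0:0", "SPA"),
    Path("vmhba1:1:0", "SPB"),
]
paths[0].alive = False
first_choice = choose_failover_path(paths, paths[0], active_passive=True)
```

With one SPA path down, the sketch picks the other SPA path and no trespass is needed. Only once both SPA paths are dead does it fall over to SPB and flag a trespass.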
Technically, point two needs some work.
The ESX SCSI layer has a latency between when the failure occurs and when it will try a different path. This is usually determined by when the device driver returns a failure, i.e. I/O errors. Supported Fibre Channel devices will do this inside of thirty seconds. The ESX SCSI layer then has thirty seconds to use another working path. On an active/passive array the activate/trespass/etc. command can take a few seconds to complete.
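As a rough worked example, the two thirty-second windows above add up as follows. The figures are taken from the text as given; the constant and function names are made up for the illustration, and the trespass time is assumed to be spent inside the second window.

```python
DRIVER_FAILURE_REPORT_S = 30   # driver returns the I/O error within this window
PATH_SWITCH_WINDOW_S = 30      # SCSI layer must be on a working path within this window


def worst_case_failover_s(trespass_s: float = 0.0) -> float:
    """Worst-case seconds from path failure to I/O flowing on a new path.

    `trespass_s` is the time the array needs to activate the new path; it
    has to fit inside the path-switch window rather than extend it.
    """
    if trespass_s > PATH_SWITCH_WINDOW_S:
        raise ValueError("trespass does not fit in the path-switch window")
    return DRIVER_FAILURE_REPORT_S + PATH_SWITCH_WINDOW_S


budget = worst_case_failover_s(trespass_s=5)   # 60 seconds in total
```

A trespass that takes a few seconds therefore uses up part of the second window but does not change the sixty-second worst case.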
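Within a Windows virtual machine, the registry value “TimeoutValue” under [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Disk] sets how long, in seconds, the guest waits before issuing an abort for an outstanding I/O. Note that the example “TimeoutValue”=dword:000000be is hexadecimal for 190 seconds; a 60-second timeout would be dword:0000003c. With the timeout set to 60 seconds, ESX states that within those 60 seconds it will: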
- detect that a path has failed
- select a new path
- activate the new path
- re-issue the command from the guest on the new path
- have the newly issued command complete successfully and return to the guest.
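To check a guest setting against that window, the registry-style dword can be decoded from hexadecimal. The snippet below is a small sketch; `parse_reg_dword` and `covers_failover` are hypothetical helper names, and the 60-second default is simply the figure quoted above.

```python
def parse_reg_dword(text: str) -> int:
    """Decode a .reg-style value such as 'dword:000000be' into an integer."""
    prefix = "dword:"
    if not text.lower().startswith(prefix):
        raise ValueError(f"not a dword value: {text!r}")
    return int(text[len(prefix):], 16)   # .reg dwords are written in hex


def covers_failover(timeout_s: int, failover_window_s: int = 60) -> bool:
    """True if the guest waits at least as long as the failover window."""
    return timeout_s >= failover_window_s


guest_timeout = parse_reg_dword("dword:000000be")   # 190
sixty_seconds = parse_reg_dword("dword:0000003c")   # 60
ok = covers_failover(guest_timeout)
```

Either value is long enough for the guest to ride out the failover sequence listed above. A value below 60 would let the guest abort the I/O before ESX has finished switching paths.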