Wednesday, July 9, 2008

VMware ESX 3.5, MSCS and iSCSI

Trying to set up Microsoft Cluster Service (MSCS) on VMware ESX 3.5 with iSCSI turned out to be a real pain in the *ss.

The PDF from VMware (http://www.vmware.com/pdf/vi3_35/esx_3/vi3_35_25_u1_mscs.pdf) describes the "proper", supported way of doing it. However, I don't have Fibre Channel, and if you follow those instructions you can't VMotion any machine that has a clustered disk (unless you keep all the VMs on the same ESX server, but then what is the point of MSCS?).

Anyway, I wanted a solution that would give me clustering and still let me keep the benefit of VMotion.

So... my solution was to use the Microsoft iSCSI Initiator inside the Windows VMs and connect to the iSCSI LUNs directly from the VMs.

First I set up the iSCSI SAN. For this I used the newest Openfiler 2.3 as the iSCSI target. I set it up to boot from a pen drive (USB memory stick) following some instructions here: http://ha.nnes.be/index.php/install-openfiler-on-usb-stick/
Mind you... I didn't do the chroot, and I had created separate /boot and / partitions, so my install was slightly different. Anyway... got it all working.

Next... set up Windows 2003 VMs with clustering and install the Microsoft iSCSI Initiator.
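For what it's worth, here is a rough sketch of how the initiator side could be scripted inside each Windows VM instead of clicking through the GUI. It only builds (and by default just prints) the iscsicli commands. The portal IP and the target IQN are made-up placeholders, not values from my setup, and the Python function names are purely illustrative.

```python
import subprocess


def iscsi_login_commands(portal_ip, target_iqn):
    """Build the iscsicli calls to register a portal and log in to a target.

    portal_ip and target_iqn are placeholders supplied by the caller.
    """
    return [
        ["iscsicli", "QAddTargetPortal", portal_ip],  # register the Openfiler portal
        ["iscsicli", "ListTargets"],                  # show the targets it offers
        ["iscsicli", "QLoginTarget", target_iqn],     # quick login to one target
    ]


def run_commands(commands, dry_run=True):
    """Print each command; only execute it when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    cmds = iscsi_login_commands("192.168.10.5",
                                "iqn.2006-01.com.openfiler:tsn.example")
    run_commands(cmds)  # dry run: prints the commands without executing them
```

Note that a quick login like this does not survive a reboot. For cluster disks you would presumably want a persistent login as well, which is easiest to tick in the initiator GUI.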

Tired now....gonna post the rest later....


Edit: August 14th, 2008

Found out that USB drives are somewhat flaky for this purpose. Plus, flash memory has a limited number of write cycles. The limit can be anywhere from 100,000 to 1-5 million writes, but on a system that is supposed to be up 24x7 and be extremely reliable, I don't feel comfortable using a $10 USB stick that could wear out.
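To put rough numbers on that worry, here is a back-of-the-envelope sketch. It assumes the worst case: one "hot" block (say, a log file) gets rewritten at a fixed interval and the stick does no wear leveling at all. Real sticks will usually do better, and the rewrite interval is my assumption, not something I measured.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_until_worn(cycle_limit, seconds_between_rewrites):
    """Days until a single block reaches its write-cycle limit.

    Assumes no wear leveling and one rewrite of the same block
    every seconds_between_rewrites seconds.
    """
    rewrites_per_day = SECONDS_PER_DAY / seconds_between_rewrites
    return cycle_limit / rewrites_per_day


if __name__ == "__main__":
    # Cycle limits taken from the range quoted above; the 60 s interval is assumed.
    for limit in (100_000, 1_000_000, 5_000_000):
        days = days_until_worn(limit, seconds_between_rewrites=60)
        print(f"{limit:>9,} cycles: about {days:,.0f} days")
```

At the low end of the range, a block rewritten once a minute would hit its limit in a couple of months, which is why I didn't want to bet a 24x7 box on it.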

So... I have redone this all with Openfiler 2.3, an Areca 1230 RAID card, and two small internal hard drives attached via SATA to the motherboard, using software RAID for the boot drives. I have been running this setup for over a month and have done extensive testing on various settings/configs to find the one that works best (for me).

I use iSCSI for the clustered disks (with the Microsoft iSCSI Initiator) and NFS mounts for all regular disks (from VMware ESX). Surprisingly, NFS performs pretty much the same as iSCSI, plus a) it is readable from the Openfiler box directly and b) it doesn't have the 2 TB LUN restriction. The only downside I have seen so far is that I don't get disk performance stats for the NFS disks from ESX. Not a huge thing.

Two important things to get the best speed from NFS. First, set async mode on the NFS export. This will significantly improve your write speed, but it assumes you are using a good UPS, because async acknowledges writes before they are safely on disk and a power loss can cost you data. Second, use the alb (balance-alb) bonding mode for your two Gigabit Ethernet NICs. This mode does not have any special switch requirements but allows good speed both receiving and sending.
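Until I get around to documenting the real steps, here is a small sketch of what those two settings look like as plain Linux config lines. Openfiler normally manages these through its web GUI, so treat this as an illustration only. The share path, client subnet, miimon value and the no_root_squash option (which ESX generally needs because it mounts as root) are my assumptions here, not a copy of my actual config.

```python
def nfs_export_line(path, client, sync_mode="async"):
    """Render one /etc/exports entry for a share.

    path and client are placeholders; sync_mode must be "sync" or "async".
    """
    if sync_mode not in ("sync", "async"):
        raise ValueError("sync_mode must be 'sync' or 'async'")
    options = ["rw", sync_mode, "no_root_squash"]
    return f"{path} {client}({','.join(options)})"


def bonding_options(device="bond0", mode="balance-alb", miimon_ms=100):
    """Render a modprobe 'options' line for the Linux bonding driver."""
    return f"options {device} mode={mode} miimon={miimon_ms}"


if __name__ == "__main__":
    # Hypothetical share path and subnet, for illustration only.
    print(nfs_export_line("/mnt/vg0/vmstore", "192.168.10.0/24"))
    print(bonding_options())
```

If you don't trust your UPS, pass sync_mode="sync" instead and accept the slower writes.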

I will try to document my steps here in a little while.