VMware Fault Tolerance (FT) vs. Microsoft Clustering (MSCS)
Having had a lot of experience with various clustering technologies, I thought I’d pass on what I have found to be a great solution: VMware’s Fault Tolerance (FT). This is a review of how FT stacks up against traditional MSCS, whether for SQL servers, simple file servers, or any other custom application server that demands high availability (HA). Obviously, for these purposes, we are talking about a Windows-based operating system. VMware FT will work with any operating system supported by VMware, whether it be Red Hat Linux, Solaris, etc. (and usually even the ones that aren’t). However, this is more of a case study on the pros of FT for a Microsoft server that demands HA.
Now, there are all sorts of MSCS/VMware clustering combinations that can be mixed and matched, but this post focuses on TRULY fault-tolerant solutions. In other words, downtime is ideally measured in milliseconds, not seconds or minutes.
That being said, it’s key to point out that the virtualization tide is essentially a tidal wave by now. I’ve been working with modern virtualization (mostly VMware) since very early in the game (12/04), and I’ve seen firsthand how quickly the benefits show up and how quickly environments become virtualized.
So in a mixed environment, with multiple VMware ESX servers, and some physical Windows SQL/File/App servers that are truly business critical, why use VMware FT? Why not just stick with the old MSCS standby? Well, here are the key reasons:
1.) Ease of configuration. Everyone who has set up an MSCS cluster can attest that it can be very finicky. Also, failovers are not always, how shall I say, as quick as they should be. If resource groups don’t have the proper dependencies, or if the internal heartbeat isn’t quite working, failover will not work.
2.) Leverages existing hardware. The assumption here is that in a mixed environment, ESX servers are available – and presumably using shared storage, and have HA enabled. So, you aren’t using hardware SOLELY dedicated to clustering – it’s already purchased and in use.
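3.) Chance of differing configuration over time is eliminated. With multiple people possibly having access to a traditional two-node MSCS cluster, there is potential for rogue changes that leave the two nodes out of sync. That kind of drift is absolutely impossible with FT, because the secondary VM is a running mirror of the primary rather than a separately maintained server.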
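
To make point 3 a bit more concrete, here is a minimal sketch of the kind of check you end up wanting on a two-node MSCS cluster: compare a snapshot of each node’s settings and report anything that differs. The function name, the setting names, and the values are all made up for illustration; this is not an MSCS or VMware API, just plain Python dictionaries standing in for whatever inventory you actually collect.

```python
ABSENT = "<absent>"  # placeholder shown when a setting exists on only one node


def config_drift(node_a, node_b):
    """Return {setting: (value_on_a, value_on_b)} for every setting that differs."""
    drift = {}
    for key in sorted(set(node_a) | set(node_b)):
        a = node_a.get(key, ABSENT)
        b = node_b.get(key, ABSENT)
        if a != b:
            drift[key] = (a, b)
    return drift


# Hypothetical snapshots of two cluster nodes (illustrative values only).
node1 = {"sql_service_pack": "SP3", "max_memory_mb": 8192, "hotfix_installed": True}
node2 = {"sql_service_pack": "SP2", "max_memory_mb": 8192}

for setting, (a, b) in config_drift(node1, node2).items():
    print(f"{setting}: node1={a!r} node2={b!r}")
```

With a traditional cluster, someone has to run a check like this (or do it by eye) on a regular basis. With FT there is only one guest configuration to maintain, so there is nothing to compare.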