I played with Heartbeat a year or so ago, and while it was OK, I was not convinced it was 100% there yet. It could not monitor the resources it managed, which was a major drawback in my mind.
I recently had a need for a cluster again, so I had another look. Since then they’ve released Heartbeat version 2, which has a full resource manager like you’ll find in more mature systems.
The only time I’ve previously used a cluster system extensively was on Windows 2003 Enterprise with MS SQL Server Enterprise; the setup then was an active-passive SQL cluster with shared fibre storage. The Windows cluster service works quite well, has a solid GUI and supports many nodes in a cluster.
I wouldn’t yet compare Heartbeat to commercial offerings. Technically, the Cluster Resource Manager introduced in version 2 is a massive step forward, but configuration can be a real nightmare.
The documentation is very thin on the ground and configuration has to be done through XML files. There is a disturbing trend these days for people to think XML is an acceptable configuration format for humans, but it really is not. Worse, the DTD for the XML format is the definitive configuration reference, as their wiki states:
It was out of date and didn’t take into account the fact that not everyone is on the same version. Instead, you should refer to crm.dtd on your system (which is always appropriate to your version).
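To give a feel for what that XML looks like, here is a rough Python sketch that builds one resource definition for an init script. The element and attribute names (primitive, operations, op, class, type and so on) are written from memory of the Heartbeat 2 format and are assumptions, the timeout and interval values are placeholders, and the function name build_lsb_primitive and the resource id haproxy are made up for illustration. As the wiki says, check the result against crm.dtd on your own system.

```python
import xml.etree.ElementTree as ET


def build_lsb_primitive(resource_id, init_script, monitor_interval="30s"):
    """Return a <primitive> element describing an init script resource.

    Assumption: the element and attribute names follow the Heartbeat 2
    CIB format as remembered; verify them against your local crm.dtd.
    """
    primitive = ET.Element(
        "primitive",
        {"id": resource_id, "class": "lsb", "type": init_script},
    )
    operations = ET.SubElement(primitive, "operations")
    # A monitor operation asks the cluster to check the resource regularly.
    ET.SubElement(
        operations,
        "op",
        {
            "id": f"{resource_id}-monitor",
            "name": "monitor",
            "interval": monitor_interval,
            "timeout": "20s",  # placeholder value, not a recommendation
        },
    )
    return primitive


def to_xml(element):
    """Serialise an element to a string that can be saved to a file."""
    return ET.tostring(element, encoding="unicode")
```

Calling to_xml(build_lsb_primitive("haproxy", "haproxy")) gives a small XML fragment that could be saved to a file and loaded with the cluster's command line tools. Generating the XML from a script at least keeps the ids consistent, which is where I made most of my typos when writing it by hand.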
Heartbeat does provide a GUI, but I found it immature and inconsistent, and it often popped up error messages with no contents other than an ‘OK’ button. It also lacked some features. While evaluating it I decided that if I had to rely on the GUI in any way as it stands today, I would not use Heartbeat for my cluster, as it would invalidate any high availability hopes I had. It is useful, though, for monitoring and visualising your cluster, especially if you have a lot of groups.
Once I had figured out the correct XML formats for what I wanted, learned the command line tools and written my own documentation for them, I eventually got a full 2 node cluster going. It currently manages 5 resources, with more to follow.
My main goal with this project was to manage HAProxy on the cluster. This is not because HAProxy is in any way unstable, but because I find it difficult to do maintenance with just one machine running it, and as I adopt HAProxy more, the hardware would become an unacceptable single point of failure.
Heartbeat lets you manage resources using several types of scripts. The best ones to use would be the new OCF standard scripts, which are designed specifically for managing cluster resources, but it’s an emerging format so not a lot of scripts exist for it today. Heartbeat also supports standard /etc/init.d/ style rc scripts, with the caveat that they have to be 100% LSB compliant. You’d think at least the scripts that Red Hat provides are LSB compliant, but you’d be wrong. I had to fiddle with almost every one I wanted to use, which is not optimal because I hate editing non-config files delivered by RPM. I think it’s very poor of Red Hat, who have been making a point of telling anyone who would listen that they’re completely LSB compliant.
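For anyone wondering what LSB compliant means in practice, the part that tends to matter for a cluster is the exit codes: start and stop should return 0 even when the service is already in that state, and status should return 0 when running and 3 when stopped. Below is a minimal Python sketch of that check. It is not a tool that ships with Heartbeat; the function names are my own, and it only covers these few exit codes rather than the whole LSB spec.

```python
import subprocess

# LSB exit codes for the "status" action.
STATUS_RUNNING = 0
STATUS_NOT_RUNNING = 3


def run_init_script(script, action):
    """Run `script action` and return its exit code."""
    return subprocess.run([script, action], check=False).returncode


def check_lsb_compliance(script, run=run_init_script):
    """Return a list of failed checks; an empty list means all passed.

    This really starts and stops the service, so use a test machine.
    The `run` parameter exists so the runner can be swapped out.
    """
    steps = [
        ("start", 0, "start should succeed"),
        ("status", STATUS_RUNNING, "status should report running"),
        ("start", 0, "start on a running service should still succeed"),
        ("stop", 0, "stop should succeed"),
        ("status", STATUS_NOT_RUNNING, "status should report not running"),
        ("stop", 0, "stop on a stopped service should still succeed"),
    ]
    failures = []
    for action, expected, description in steps:
        actual = run(script, action)
        if actual != expected:
            failures.append(
                f"{action}: {description}, expected {expected}, got {actual}"
            )
    return failures
```

Running check_lsb_compliance("/etc/init.d/haproxy") as root on a test box (the path is just an example) would list which steps misbehave. In my experience the usual offender is a status action that returns 0 when the service is stopped.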
I would also have liked to build an HA NFS server, but unfortunately Heartbeat version 2 and DRBD version 8 do not yet play nicely together, so that is a project for some other time.
My conclusion on Heartbeat, then, is that it is a good, solid project, especially with version 2. I think in a year or so, once the documentation etc. has had a chance to mature, it will be a good choice for almost anyone. For now, though, it is unfortunately out of reach for the average guy.