Linux has a well-deserved reputation as a reliable server operating system, but that reputation alone does not guarantee reliability: software is never entirely bug-free, and hardware can fail. In this column, we examine Linux’s high availability features and how they can be used to build robust, high-performance Linux servers.
‘High availability’ is a term covering a number of different technologies that can be employed to improve server reliability and performance significantly. Of these technologies, the most important are failover systems and load balancing. Other technologies such as RAID and distributed file systems are often associated with high availability, but are not strictly high availability technologies.
A failover system is used to ensure constant uptime of a single server. It requires two identical servers, a master and a slave, operating in parallel. The master is visible to the outside world; the slave is not. The two servers are directly connected by either a serial or a network cable and constantly send each other what is known as a heartbeat: a message indicating that the sending server is online and operating properly. A failover system relies on the regular transmission of these messages. If the slave stops receiving heartbeats from the master, it concludes that the master is not operating correctly, takes over the master’s network address and begins providing services as if it were the master. When the master starts sending heartbeats again, the slave yields the network address back to the master and resumes monitoring it.
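The slave’s side of this exchange can be sketched as a small state machine. The following Python is only an illustration under stated assumptions, not the code of any real failover package: the class name `StandbyMonitor`, the `timeout` parameter and the action strings are all hypothetical, and time is passed in explicitly so the behaviour can be followed without real network traffic.

```python
class StandbyMonitor:
    """Slave-side view of a failover pair (hypothetical sketch)."""

    def __init__(self, timeout, start=0):
        self.timeout = timeout        # seconds of silence tolerated (assumed value)
        self.last_heartbeat = start   # time the master was last heard from
        self.holds_address = False    # True while the slave has the service address

    def heartbeat_received(self, now):
        """Record a heartbeat; yield the address if we had taken it over."""
        self.last_heartbeat = now
        if self.holds_address:
            self.holds_address = False
            return "release"          # master is back: hand the address over
        return "ok"

    def check(self, now):
        """Periodic check made when no heartbeat has arrived."""
        if self.holds_address:
            return "serving"          # already acting as the master
        if now - self.last_heartbeat > self.timeout:
            self.holds_address = True
            return "takeover"         # master silent too long: claim the address
        return "standby"


def simulate(heartbeat_times, end, timeout):
    """Step through whole seconds 0..end and log what the slave does."""
    monitor = StandbyMonitor(timeout)
    log = []
    for now in range(end + 1):
        if now in heartbeat_times:
            action = monitor.heartbeat_received(now)
        else:
            action = monitor.check(now)
        log.append((now, action))
    return log


if __name__ == "__main__":
    # The master sends heartbeats, falls silent from second 3 to 7, then recovers.
    for second, action in simulate({0, 1, 2, 8, 9}, end=9, timeout=3):
        print(second, action)
```

In a real deployment the "takeover" and "release" actions would correspond to claiming and giving up the master’s network address; here they are simply recorded so that the sequence of decisions is visible.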
Failover systems are commonly employed where constant uptime is a vital service requirement, such as in firewalls and database servers. The drawback of a failover system is that hardware and data must be duplicated. The slave system must be identical to the master, so the content being served must be relatively static and updated on both systems simultaneously in order to avoid data loss. Shared network storage can overcome this problem.
A second high availability technology, load balancing, can also be used to resolve many of the drawbacks of a failover system.
A load balancing system shares connections and load evenly between several identical servers that are indistinguishable to the user, thereby improving scalability and performance. A load balancing system typically requires a single dedicated server to act as the load balancer and a number of other servers located behind the load balancer to serve requests. Only one server, the load balancer, is visible to the user.
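One simple way to share connections evenly is round-robin scheduling, in which each new request goes to the next server in a fixed rotation. The sketch below assumes that policy purely for illustration; real load balancers offer several scheduling methods, and the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out back-end servers in a fixed rotation (illustrative sketch)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one back-end server is required")
        self._rotation = cycle(list(backends))  # endless iterator over the servers

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["web1", "web2", "web3"])  # hypothetical names
    for request_id in range(6):
        print(request_id, balancer.pick())
```

The user only ever talks to the balancer, so the rotation behind it is invisible; adding a server to the list increases capacity without changing what the user sees.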
Load balancers are commonly employed where reliable performance beyond the capability of a single computer is required. Popular Web sites such as search engines and portals are typical examples.
Load balancing and failover systems can be combined to provide very high availability. Typically, a failover system is employed on the load balancer to ensure constant uptime of the system visible to all users. This separates the data being served from the system that is accessible to users, which overcomes the difficulty of keeping data current on the failover server: the data is now located behind the load balancer on any number of invisible servers. Downtime on the invisible servers is acceptable, as the load balancer can redirect requests intended for an offline server to another in the server farm.
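The redirection of requests away from an offline server can be sketched as follows. This is a minimal illustration, assuming a round-robin rotation and an external health check that calls `mark_offline` and `mark_online`; those method names, the class name `HealthAwareBalancer` and the server names are hypothetical and not taken from any particular product.

```python
class HealthAwareBalancer:
    """Rotates through back-end servers, skipping any marked offline (sketch)."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._online = set(self._backends)  # every server starts out online
        self._next = 0                      # index of the next server to try

    def mark_offline(self, backend):
        """Called by a (hypothetical) health check when a server stops responding."""
        self._online.discard(backend)

    def mark_online(self, backend):
        """Called when a known server has recovered."""
        if backend in self._backends:
            self._online.add(backend)

    def pick(self):
        """Return the next online server, trying each server at most once."""
        for _ in range(len(self._backends)):
            candidate = self._backends[self._next]
            self._next = (self._next + 1) % len(self._backends)
            if candidate in self._online:
                return candidate
        raise RuntimeError("no back-end servers are online")


if __name__ == "__main__":
    farm = HealthAwareBalancer(["web1", "web2", "web3"])  # hypothetical names
    farm.mark_offline("web2")                             # simulate a failure
    print([farm.pick() for _ in range(4)])
```

Requests that would have gone to the failed server are simply handed to the next one in the rotation, which is why downtime on an individual back-end server need not be visible to users.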
The Linux HA project (www.linux-ha.org) has developed a failover system named Heartbeat. It includes a heartbeat transmitter, a monitor and IP address takeover software. Heartbeat is included in many major distributions, including Mandrake and SuSE. Detailed installation instructions are available from the Linux HA Web site.
Red Hat uses the open source Kimberlite (http://oss.missioncriticallinux.com/projects/kimberlite) clustering system to provide failover services in the Red Hat Advanced Server distribution. This advanced clustering solution focuses on a failover system built from two servers and shared SCSI storage. Kimberlite is targeted at commercial environments.
The Linux Virtual Server project (www.linuxvirtualserver.org) has created a number of kernel patches and tools for building a load balancing server with Linux. Software is currently available for 2.2.x, 2.4.x and 2.5.x series kernels. Detailed instructions on how to build a load balancer are also available from the Linux Virtual Server Web site.
Be sure to check the Web sites of these projects for more information on high availability technologies under Linux.