This is a distilled version of the talk I gave at the phpug Dortmund in March. I’ll concentrate on the High Availability aspect here, but I won’t leave out Load Balancing completely. Also, this is in English while the talk was in German.
High Availability and Load Balancing are often used interchangeably, if not as outright synonyms, which is about as wrong as it can get. They certainly have some things in common (several machines doing something), and they are often needed at similar times (when a website becomes more important), but the fundamental ideas are quite different.
Load Balancing spreads incoming HTTP requests over several machines, but that alone is not a highly available setup. High Availability is all about eliminating single points of failure (SPOFs). And if you have a load balancer (be it the 2k black box with magic features or the 1k Linux box with magic features) that spreads your load, that load balancer is a single point of failure.
The definition of a SPOF is pretty simple: if there is a spot in your architecture that every HTTP request must go through, with no way around it, that spot is a SPOF. It might also be a performance bottleneck, but that is a matter for Performance Tuning or Load Balancing, not High Availability. SPOFs can appear at all levels, ones you might or might not expect. Or care about. For you it might be okay if your data center is offline for an hour or so, but for somebody like Yahoo or Amazon that is not an option. So a data center can be a SPOF; what counts depends on your scale. Some companies go even further, see the whole Internet as a SPOF, and place mini data centers at central exchange points (like DE-CIX) or even at providers (there are other reasons to do so, latency for example, but still). So a SPOF can be anything from a machine, a cable, a switch, a router, up to a whole data center. You have to decide for yourself what applies to you.
Different SPOFs call for different ways to eliminate them, but one principle always applies: introduce redundancy. Get multiple peerings and power supplies in your data center, get multiple routers, routes, switches and cables, and finally get multiple machines that jump into action when others fail. One popular and probably very simple example is the hot spare: a pair of identical machines is set up for a service, and only one gets used. When the first fails, the second takes over the job. This is nice and can easily be extended to hot hot spares and hot hot hot spares, but then it gets costly. One flaw is obvious: you have expensive hardware that has to be maintained and taken care of, and it sits there idling, doing absolutely nothing, not even watching TV. If you were into Load Balancing (which we’re not, remember, High Availability), you’d want the hot spares to help you with the load.
Take the example setup from above where you have a physical load balancer in place: you’d need a hot spare for it, which gets quite pricey rather soon. Especially if you are considering spanning your site over multiple data centers, either for High Availability or for latency reasons, you need to replicate your setup and buy yet another load balancer, plus another hot spare.
A common idea for spreading HTTP load is round-robin DNS, where a single domain resolves to a couple of IP addresses instead of one. DNS more or less randomly picks one of the addresses and sends it back to the user, who then makes an HTTP request to that IP. This works quite well in practice, but it doesn’t solve the High Availability problem: if one of two machines dies, 50% of your visitors don’t get through, and that’s quite a lot. Changing IP addresses in DNS based on availability is not feasible because of caching.
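For illustration, round-robin DNS is nothing more than several A records for the same name. A zone file snippet (with made-up documentation addresses) could look like this:

    ; both records answer for www.example.com, resolvers rotate through them
    www    IN  A    192.0.2.10
    www    IN  A    192.0.2.11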
Another common practice, for static content, is to create subdomains like images1.example.com and images2.example.com that get a new number each time a machine is added. The application is then responsible for picking a server at random. This becomes a problem if you’re doing SSL, because you need a certificate for every subdomain, which is expensive and doesn’t scale well (more on proper scaling in a later posting).
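As a minimal sketch of that random pick, in PHP and with made-up host names:

    <?php
    // hypothetical list of static-content hosts, one entry per image server
    $image_hosts = array('images1.example.com', 'images2.example.com');

    function image_url($path) {
        global $image_hosts;
        // pick a host at random for this asset
        $host = $image_hosts[array_rand($image_hosts)];
        return 'http://' . $host . '/' . ltrim($path, '/');
    }

    echo image_url('logo.png'); // e.g. http://images2.example.com/logo.png
    ?>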
Introducing the fun stuff: wouldn’t it be cool if you could just use all your machines and put them all to work in a highly available manner? It would, so how do we do it?
Getting back to the round-robin DNS solution from above: all we need is a device that supervises all machines and, in case of a failure, reassigns IP addresses to the working ones. Such a device can be a little program (or two) that runs on each machine in the group, so the machines watchdog each other and don’t rely on an external box (which, again, would need a hot spare).
The first step to get there is to install Spread. Spread is a toolkit for group communication between computers: it defines things like groups, membership, reliable and unreliable messaging inside groups, and so on. What we do is install Spread on each web server and tell every instance which machines belong to the web server group. The machines can now talk to each other and form groups, but that alone doesn’t solve any problems yet.
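To give a rough idea of what that looks like, here is a minimal spread.conf segment for two web servers (host names and addresses are made up, and you should double-check the syntax against the Spread documentation):

    # one broadcast segment on port 4803, listing both web servers
    Spread_Segment 192.168.0.255:4803 {
        web1    192.168.0.11
        web2    192.168.0.12
    }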
Enter Wackamole. Wackamole is a Spread service, if you will, that takes Spread’s concept of groups and applies it to managing IP addresses. You install Wackamole and tell it to create a new group for all machines, and you assign this group the set of IP addresses that your DNS server gives out for your domain. Now this group is responsible for the IP addresses your domain runs on. In a two-machine, two-IP-address setup, every member gets one IP and serves requests as they come in. Note again that we are only doing very basic load balancing here, which, basic as it is, is often sufficient. But this is still High Availability.
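A wackamole.conf for such a two-machine, two-address setup would look roughly like the following sketch (directive names quoted from memory, interface and addresses made up; consult the Wackamole documentation before relying on it):

    Spread  = 4803                    # port of the local Spread daemon
    Group   = wackamole               # the Spread group all members join
    Control = /var/run/wackamole.it   # local control socket

    Prefer None                       # no machine prefers a particular address

    VirtualInterfaces {
        { eth0:192.0.2.10/32 }        # the two addresses DNS hands out
        { eth0:192.0.2.11/32 }
    }

    Balance {
        AcquisitionsPerRound = all
        Interval = 4s
    }

    Mature = 5s                       # wait before a new member may acquire addresses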
Now a machine dies. The Spread-based communication quickly finds out about it (the time is configurable, but 5 seconds proved reasonable) and, with Wackamole in place, the remaining members decide who gets the IP that was assigned to the dead machine. Requests that come in in the meantime will not succeed, but once the IP is taken over, everything looks up and running again from a user’s perspective. Unfortunately, changing IP addresses is not all you need. Traffic gets routed to your IP addresses, and if a router still thinks an IP belongs to a certain MAC address (that is how delivery on the local network works), the requests still fail. Fixing this requires ARP, the Address Resolution Protocol. You can tell Wackamole which machines should be notified when an IP takeover happens; usually you want your router or routers to know about it. Wackamole then sends a special ARP packet that invalidates the stale entry in their ARP cache, so they learn the new MAC address for the transferred IP. Neat, eh?
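In the configuration sketch above, that notification is a Notify block, roughly like this (again from memory, with a made-up router address):

    Notify {
        eth0:192.0.2.1/32             # tell the router about the takeover
        arp-cache                     # and every host currently in the local ARP cache
    }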
You fix the machine and bring it back online; Spread/Wackamole notice this and almost immediately (~5 seconds again) start to rebalance the IP addresses in the group. That is, the revived machine gets assigned an address (not necessarily the one it had before) and can start serving content again.
Static content is easy, but you want to run dynamic content, and that usually comes with some problems. HTTP is stateless, and sessions are used to work around this. By default, PHP stores sessions on the server’s local disk, so if a user happens to swap servers mid-session, the session is lost. Sessions must therefore be stored in a place that is central to all web servers: a database, memcached, or whatever you find suitable.
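As a minimal sketch, assuming the pecl memcache extension is installed and using a made-up memcached host, central session storage can be as simple as this:

    <?php
    // keep sessions in memcached instead of on the local disk,
    // so any web server in the group can pick up any session
    ini_set('session.save_handler', 'memcache');
    ini_set('session.save_path', 'tcp://192.0.2.50:11211');

    session_start();
    $_SESSION['visits'] = isset($_SESSION['visits']) ? $_SESSION['visits'] + 1 : 1;
    ?>

A database-backed handler registered with session_set_save_handler() achieves the same thing.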
SSL is often used and is not a problem here: because only a single domain name is served, you don’t need multiple certificates, just one, and all web servers simply get a copy of it.
And finally, you also want to store your HTTP logfiles in a central location. You can again use a database for that, or, once more, Spread, but we’ll look at that in another posting.
You can read about all this and much more in Theo Schlossnagle’s book Scalable Internet Architectures, a must-read for everybody who deals with the issues described above. It is fairly thin for the price, but every page is worth it, both content- and entertainment-wise, and you won’t regret the purchase.