I've recently developed an interest in VPS clusters, and it seemed like a great idea to try a multi-VPS setup, which I describe below. It isn't really meant for production use; it's more of an experiment to improve my skills and knowledge of multi-server systems. However, since I lack the know-how, I need some general information.

Setup description

Since I deal daily with Node.js applications that use Redis as a datastore, that would serve as the basis of the setup. What I had in mind is a setup with a minimum of 2 VPSes. Each server would run the same Node services (let's say 5 different services at once on both servers), and each VPS would also run one instance of Redis, which the Node services use to store data. The aim is to mirror data between both servers (if Node service #1 on the first server adds something to Redis, the change should also be reflected on the second server). Uploaded files etc. should likewise be mirrored on both servers (in the sense that changes in either datastore or filesystem must be reflected in the other).

Ideally, this would allow simple load balancers to share the load between servers, and in case of a single server failure the other servers would keep running and therefore keep the Node services online. The changes don't need to be mirrored quickly; even a minute-long delay wouldn't really matter. However, if a user is actively updating data, he must be presented with the data he just changed (in the sense that he must somehow be forced to communicate with the server he updated the data on, since the changes might not have been mirrored to the other servers yet).


  1. What are the reasonable ways to achieve load balancing? I've heard
    about using some DNS magic, but don't really understand it. Simply updating DNS
    records would be too slow since they are cached in multiple places. Also I
    read about using one "main proxy" server that would handle balancing
    between other servers. This seems a bit risky because if the main
    server failed, everything would be offline.

  2. How can I mirror parts of the filesystem across different VPSes so
    that uploaded images etc. are present on both servers? Are there any
    widespread software options, or would a simple script that detects
    uploads and then replicates those files to the other servers work
    just fine?

  3. Does Redis even support the kind of mirroring I described? I only
    found information about master-slave replication, which, if I
    understood correctly, means that the updates are one-way, in the
    sense that master can update the slave, but the slave cannot update
    the master.

Thank you!

Best Answer

Can help with Question 1 only.

There are several approaches to load balancing and failover (simplest first):

  1. DNS round robin (load balancing and failover)
  2. Dynamic DNS (failover)
  3. Proxies (load balancing and failover)
  4. Local IP failover (failover)
  5. BGP Anycast (load balancing and failover)

DNS load balancing is simple. Say you have two (or more) servers, each with its own IP address. To set up DNS load balancing, you create one A record per server for your hostname, say www.example.com:

www.example.com. A

(The DNS server should also be configured to serve this name in round-robin mode, but that's usually the default anyway.)

Now each DNS request for www.example.com will be answered with both addresses, in a pseudo-random order, so your clients are likely to spread evenly between the servers.
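
To make the spreading effect concrete, here is a minimal simulation sketch. It is not a real DNS server: the `RoundRobinZone` class is invented for illustration, the two addresses are placeholder documentation addresses (not anyone's real servers), and strict rotation stands in for the pseudo-random ordering a real server may use.

```python
from collections import Counter, deque

# Hypothetical A records for www.example.com (placeholder documentation addresses).
RECORDS = ["192.0.2.10", "198.51.100.20"]


class RoundRobinZone:
    """Toy name server that rotates its answer on every query."""

    def __init__(self, records):
        self._records = deque(records)

    def query(self):
        answer = list(self._records)  # full record set in the current order
        self._records.rotate(-1)      # the next query starts with the next address
        return answer


def simulate(clients):
    """Count how many clients land on each server."""
    zone = RoundRobinZone(RECORDS)
    # Assumption: each client connects to the first address in its answer.
    return Counter(zone.query()[0] for _ in range(clients))


print(simulate(1000))
```

With an even number of simulated clients, each address ends up with exactly half of them; real traffic is only roughly even, because resolvers cache answers and clients differ in how they pick from the list.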

There's no need to update the records frequently; once it is set up, it works indefinitely. It also provides some degree of failover: if one of the hosts is down, browsers will time out and then try the second host, BUT there may be a considerable delay, and users won't like it.
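
The client-side failover described above can be sketched roughly as follows. This only illustrates the "time out, then try the next address" idea and is not how any particular browser implements it; the function name, the placeholder addresses and the `fake_connect` stand-in are all assumptions made for the example.

```python
import socket


def connect_first_available(addresses, port, timeout=3.0,
                            connect=socket.create_connection):
    """Try each address in order; return (address, connection) for the first success."""
    errors = []
    for address in addresses:
        try:
            return address, connect((address, port), timeout=timeout)
        except OSError as exc:  # refused, unreachable, or timed out
            errors.append((address, exc))
    raise ConnectionError(f"all addresses failed: {errors}")


def fake_connect(target, timeout):
    """Stand-in connector for the demo: the first placeholder host is 'down'."""
    host, _port = target
    if host == "192.0.2.10":
        raise TimeoutError("simulated dead host")
    return f"connection to {host}"


chosen, conn = connect_first_available(
    ["192.0.2.10", "198.51.100.20"], 80, connect=fake_connect
)
print(chosen, conn)
```

With the real `socket.create_connection`, a dead first host costs the user up to `timeout` seconds before the second address is even tried, which is exactly the delay mentioned above.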

Dynamic DNS. A possible addition to 1.: once a given host fails, dynamically update the DNS records and remove the reference to the failed host. However, the heavy caching in the DNS system means there will be some period of the degraded behavior I mentioned above. Using a very low TTL improves the situation, but there is still caching inside the client OS/browser that doesn't respect the TTL, and some ISPs disregard low TTLs too. Anyhow, bottom line: it's a very easy and affordable way to achieve balancing and basic failover.
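
Here is a small sketch of why caching leaves a window of degraded behavior even after the record has been removed. Everything in it is a toy model: `published_records` and `CachingResolver` are invented names, the addresses are placeholders, and time is passed in explicitly instead of read from a clock.

```python
RECORDS = ["192.0.2.10", "198.51.100.20"]  # hypothetical A records
down = set()                               # hosts the health check considers failed


def published_records():
    """What a dynamic-DNS updater would leave in the zone: healthy hosts only."""
    return [ip for ip in RECORDS if ip not in down]


class CachingResolver:
    """Toy resolver cache that keeps an answer until its TTL expires."""

    def __init__(self, lookup, ttl):
        self._lookup = lookup
        self._ttl = ttl
        self._cached = None
        self._expires = 0

    def resolve(self, now):
        if self._cached is None or now >= self._expires:
            self._cached = self._lookup()
            self._expires = now + self._ttl
        return self._cached


resolver = CachingResolver(published_records, ttl=60)
before = resolver.resolve(now=0)    # both hosts are cached
down.add("192.0.2.10")              # host fails; the zone is updated right away
stale = resolver.resolve(now=30)    # the cache still hands out the dead host
fresh = resolver.resolve(now=60)    # TTL expired: only the healthy host remains
print(before, stale, fresh)
```

A lower `ttl` shrinks the stale window, but only for caches that actually honor it.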

Proxies. Simple and popular for load balancing. To eliminate the single point of failure, you need to combine them with other approach(es).
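
As a rough illustration of what the proxy does when choosing a backend, here is a minimal round-robin selector that skips backends marked as down. It is a sketch only: the class name and backend addresses are made up, and a real proxy would also forward the traffic and run the health checks itself.

```python
from itertools import cycle


class RoundRobinProxy:
    """Pick backends in turn, skipping any that are marked unhealthy."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._healthy = set(self._backends)
        self._ring = cycle(self._backends)

    def mark_down(self, backend):
        self._healthy.discard(backend)

    def mark_up(self, backend):
        if backend in self._backends:
            self._healthy.add(backend)

    def pick(self):
        # Look at each backend at most once per call.
        for _ in range(len(self._backends)):
            backend = next(self._ring)
            if backend in self._healthy:
                return backend
        raise RuntimeError("no healthy backends")


proxy = RoundRobinProxy(["10.0.0.1:3000", "10.0.0.2:3000"])  # hypothetical backends
normal = [proxy.pick() for _ in range(4)]
proxy.mark_down("10.0.0.1:3000")
degraded = [proxy.pick() for _ in range(3)]
print(normal, degraded)
```

Unlike the DNS approaches, the switch-over here is immediate because nothing is cached on the client side; the price is that the proxy itself becomes the component that must not fail.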

IP failover. As an addition to 3., to cope with failure of the proxy itself, TWO proxies are used in an "IP failover" setup. The basic idea is to have one IP address that normally comes up on host1, but once host1 fails, host2 detects it and the IP comes up on host2. Look for the Linux "heartbeat" project. (You may also fail over the servers themselves without proxies, but then you won't have balancing.) Normally both hosts have to be on the same subnet (same datacenter).
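
The detection logic on host2 can be sketched as a small state machine. This is not the heartbeat project's actual code or configuration; the class name, the threshold of three missed beats and the "release the IP when the primary returns" policy are assumptions chosen for illustration.

```python
class StandbyNode:
    """Standby host that claims the shared IP after too many missed heartbeats."""

    def __init__(self, max_missed=3):
        self.max_missed = max_missed
        self.missed = 0
        self.holds_ip = False

    def tick(self, heartbeat_received):
        """Call once per heartbeat interval; returns True while we hold the IP."""
        if heartbeat_received:
            self.missed = 0
            self.holds_ip = False  # assumed policy: hand the IP back to the primary
        else:
            self.missed += 1
            if self.missed >= self.max_missed:
                # A real setup would bring the address up on a local interface here.
                self.holds_ip = True
        return self.holds_ip


node = StandbyNode(max_missed=3)
beats = [True, True, False, False, False, False, True]
states = [node.tick(beat) for beat in beats]
print(states)
```

Requiring several missed beats avoids a takeover on a single lost packet, at the cost of a slightly longer outage before host2 steps in.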

Anycast. The idea is to advertise routes to a single IP address (actually a single subnet) from a couple of physical locations. You need your own /24 subnet and the ability to configure BGP. Anycast is often used for DNS servers. Persistent TCP connections are difficult with it, so it fits UDP and DNS more easily, but it is still sometimes used for web too.
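
A very simplified model of the routing side, to show why long-lived TCP connections are awkward with anycast. Picking the lowest "path cost" is only a stand-in for real BGP best-path selection, and the site names and costs are invented.

```python
def best_site(path_costs, advertising):
    """Return the advertising site with the lowest path cost for one client."""
    candidates = {site: cost for site, cost in path_costs.items()
                  if site in advertising}
    if not candidates:
        raise LookupError("prefix unreachable")
    return min(candidates, key=candidates.get)


# Hypothetical view from one client: cost of the path to each site.
client_costs = {"amsterdam": 2, "new_york": 5}
advertising = {"amsterdam", "new_york"}  # both sites announce the same subnet

first = best_site(client_costs, advertising)
advertising.discard("amsterdam")         # site fails and withdraws its route
second = best_site(client_costs, advertising)
print(first, second)
```

Packets for the same IP now arrive at a different site, which knows nothing about a TCP connection that was opened against the first one. A single-packet UDP exchange such as a DNS query does not care.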

Those are the basic ideas. As you see, every method has limitations and complications. And if that's not complicated enough, you can build any imaginable combination of the above approaches :)

