Architecture – How to implement a lightweight clustered architecture for a distributed application

Tags: architecture, cluster, distributed computing, frameworks, open source

I currently have a distributed application that runs on multiple embedded PCs. The whole application is composed of one master server and several nodes. Each node is an embedded PC that runs Windows 7 Embedded and has a dual core CPU with 2GB of RAM.

The application (by definition) only works if the master is up and running, controlling all the nodes. The master server has a SQL Express database where it keeps information about every node it's supposed to control and how they are organized. The nodes have no persisted state.

Once the master and the nodes are up and running, they are manipulated and put into a certain state, which is only kept in memory at this point. The master can be controlled by a WinForms client UI that connects to it, reads its state, and sends commands that change its state (it's basically a bunch of web services exposed using .NET WCF).

The state kept inside the master (in memory) is what matters. The state inside each node can be regenerated if a node restarts for example. If the master is restarted it loses its current state (and the node states as a consequence). This means after a restart of the master server the configuration will be reloaded and a 'fresh' state will be set.

Typically a setup of this application consists of one master and 9 nodes (this is a 3×3 setup). At any point a node can fail and the application will continue without it (as long as the master is up). If the node that failed comes back, the master detects it and puts it back into the desired state.

I've been asked to improve the architecture of this application so the master server can run inside one of the nodes. So instead of a 9+1 setup we will have only 9 embedded PCs, with one elected to be the master. According to our tests, the node hardware has enough power to run both the node and the master piece together. However, the embedded PC can't be trusted and will fail much more often than the regular server we've used to host the master so far.

Because of that, I was asked to come up with a redundancy solution. It's my understanding that the proper solution would be to run two or more embedded PCs as a cluster, so that if the node running the master piece fails, another one will take its place.

Now, the question is: how to implement a lightweight cluster that can run in those conditions?

There are two main concerns that must be solved:

  1. Data persistence: not only the configuration must be saved but also the master state. That way, when the master node goes down, another node can take over as master without resetting the entire state of the application.
  2. WCF clustering: at any point, if the master node fails, another one must take over, and all connected clients (the WinForms client UI) must be able to reconnect automatically to the newly elected master node. This doesn't have to be completely transparent to the user, but the clients must be able to reconnect automatically (it doesn't matter whether the new IP address is the same or not).
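
To make the second concern concrete, here is a minimal sketch of client-side reconnection, written in Python for brevity rather than as WCF code. It assumes the client knows the addresses of all nodes and that each node can answer an "are you the master?" probe. The names `find_master` and `is_master` are hypothetical and are not part of any WCF API.

```python
from typing import Callable, Iterable, Optional


def find_master(endpoints: Iterable[str],
                is_master: Callable[[str], bool]) -> Optional[str]:
    """Return the first endpoint that answers as master, or None.

    `is_master` is a hypothetical probe supplied by the caller; it should
    raise ConnectionError when the node cannot be reached.
    """
    for endpoint in endpoints:
        try:
            if is_master(endpoint):
                return endpoint
        except ConnectionError:
            # Node is down or unreachable; move on to the next candidate.
            continue
    return None
```

A client would call this again whenever its current connection drops, then reopen its channel against the returned address.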

There are several limiting factors to a possible solution:

  • There is no way to have data storage shared between the nodes (each node has its own HD, and there is a private gigabit network between them)
  • Hardware upgrades are out of the question
  • The solution has to be light enough to run on an embedded PC. So installing a cloud server or a clustered DB probably won't be fast enough (if you think a clustered MySQL will work for the data layer, I'll be interested in hearing your thoughts)
  • The solution can't involve buying an expensive piece of software
  • The overall platform of this application must be Windows based

The best solution I've thought of so far is to use something like Prevayler to keep the master state persisted and then sync every command the master receives to the other nodes. That would solve the persistence problem across all nodes (maybe something similar can be implemented using memcache, I'm not sure). I have no solution yet for the WCF service problem.
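
The idea of journaling every command and syncing it to the other nodes can be sketched roughly as follows. This is an in-memory Python illustration of the general pattern, not Prevayler's API. The command shapes (`set_mode`, `remove`) and all class and function names are assumptions made up for the example, and network transport and disk writes are left out.

```python
import json
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MasterState:
    """In-memory state that is rebuilt purely by applying commands in order."""
    node_modes: Dict[str, str] = field(default_factory=dict)

    def apply(self, command: dict) -> None:
        if command["op"] == "set_mode":
            self.node_modes[command["node"]] = command["mode"]
        elif command["op"] == "remove":
            self.node_modes.pop(command["node"], None)
        else:
            raise ValueError(f"unknown op: {command['op']}")


@dataclass
class Replica:
    """A journal of serialized commands plus the state they produce."""
    journal: List[str] = field(default_factory=list)
    state: MasterState = field(default_factory=MasterState)

    def append(self, serialized: str) -> None:
        # Apply first so an invalid command is rejected before it is journaled.
        self.state.apply(json.loads(serialized))
        self.journal.append(serialized)


def execute(master: Replica, followers: List[Replica], command: dict) -> None:
    """Run a command on the master and forward the same bytes to followers."""
    serialized = json.dumps(command, sort_keys=True)
    master.append(serialized)
    for follower in followers:
        follower.append(serialized)


def recover(journal: List[str]) -> MasterState:
    """Rebuild the state from scratch by replaying a journal."""
    state = MasterState()
    for serialized in journal:
        state.apply(json.loads(serialized))
    return state
```

Because every replica applies the same commands in the same order, any follower that is promoted to master already holds the current state, and a restarted node can rebuild it with `recover`.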

Because this will involve a lot of development and proper testing, I thought I should hear from you guys before implementing anything.

I think a solution could be put together either by using a framework or some sort of open source software that solves part of the problem.

Please feel free to ask anything so I can improve the text of this question and make it clearer.

Best Answer

Since you are moving away from a single master node (which is appropriate), you will have to change a few things. You will need to set up a quorum. Since you already have 9 nodes, you are in good shape. For a quorum to work you need 2n+1 nodes, where n is the number of nodes that can go down while the system still works; with 9 nodes, n is 4. Within the quorum, a vote takes place on who the leader is and on which transactions are successful. This can be used to pass around configuration information and ensure everyone is synchronized without a database.
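
The 2n+1 rule can be checked with a few lines of arithmetic. The helper names below are made up for illustration; the sketch is in Python and only assumes simple majority voting.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of votes that is more than half the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """Largest n such that 2n + 1 <= cluster_size."""
    return (cluster_size - 1) // 2


def has_quorum(cluster_size: int, alive: int) -> bool:
    """True if the surviving nodes can still form a majority."""
    return alive >= majority(cluster_size)


if __name__ == "__main__":
    # For the 3x3 setup in the question:
    print(majority(9), tolerated_failures(9))  # prints "5 4"
```

So a 9-node cluster keeps working while at least 5 nodes are up, and stops accepting changes once a fifth node is lost.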

There are existing technologies out there that can help you with this. One of those is ZooKeeper, an open source project under the Apache License 2.0 for distributed coordination. You will need something along these lines. Whether you use ZooKeeper or roll your own, their white papers will be invaluable. It can also be used to maintain your configuration information about each node.

ZooKeeper is written in Java, but I have created a project (ZooKeeperNet) that will allow it to be embedded within a .NET application using IKVM. If this isn't acceptable, then you'll want to read about leader elections when determining which node will be the current master. I suggest reading all their wiki pages and recipes to get an idea of what you need to account for in a proper distributed system.

Just so you have a good understanding: ZooKeeper is the backing coordination system of Hadoop and HBase. Hadoop is a distributed MapReduce framework.

If you aren't already, you can use WCF ad hoc or registry-based discovery to find the current master node in your system. If only a single master node is alive, it will be the only one registered as supporting IMaster features. Your slave nodes will then watch each other's znodes to detect when a node goes away, so one of them can take over as master almost immediately.
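
The "watch each other's znodes" idea follows the usual sequence-number election pattern: every candidate registers with an increasing sequence number, the lowest number is the master, and each other node only watches the entry just below its own. The sketch below simulates that in memory with Python. `ElectionRegistry` and its methods are hypothetical stand-ins, not the ZooKeeper client API, and real znode watches and session expiry are not modeled.

```python
from typing import Dict, Optional


class ElectionRegistry:
    """In-memory stand-in for a directory of ephemeral sequential entries."""

    def __init__(self) -> None:
        self._next_seq = 0
        self._members: Dict[int, str] = {}  # sequence number -> node name

    def join(self, node: str) -> int:
        """Register a candidate and return its sequence number."""
        seq = self._next_seq
        self._next_seq += 1
        self._members[seq] = node
        return seq

    def leave(self, seq: int) -> None:
        """Simulate a node failing: its entry disappears."""
        self._members.pop(seq, None)

    def leader(self) -> Optional[str]:
        """The live candidate with the lowest sequence number is the master."""
        if not self._members:
            return None
        return self._members[min(self._members)]

    def predecessor(self, seq: int) -> Optional[int]:
        """The entry a node should watch: the next lower live sequence number."""
        lower = [s for s in self._members if s < seq]
        return max(lower) if lower else None
```

Watching only the predecessor means a master failure wakes up a single node instead of all eight at once, and a node whose predecessor is `None` knows it is now the master.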

Keep in mind that, in order to be highly efficient, the data each node needs to work with has to be close to the node (i.e. on the node itself). If one node acts as a data intermediary, you won't be as efficient as you could be if the nodes pulled data in a distributed fashion.
