Is this idea for a distributed database server with centralized storage feasible?

deployment file-server sqlite

I often use SQLite when building simple programs for companies. The database file is placed on a file server. This works fine as long as no more than about 50 users work against the database concurrently (the exact number depends on the mix of reads and writes). Beyond that, users notice a slowdown when there is a lot of concurrent writing, because much of the time is spent waiting on locks, and because there is no database server there is nothing like a shared cache.

The advantage of not needing a database server is that the time to set up something like a company Wiki can be reduced from several months to just days. It often takes several months because an IT department needs to order the server, the server needs to conform to company policies and security rules, and it needs to be placed at the outsourced hosting facility, which then gets it wrong and places it in the wrong location, and so on.

Therefore, I came up with an idea for a distributed database server. The process would be as follows: a user on a company computer edits a Wiki page (the Wiki uses this database as its backend). To do this, his desktop reads a file on the local hard disk that holds the IP address of the last desktop computer known to be acting as the database server, and tries to contact that computer directly via TCP/IP. If it does not answer, the desktop reads a file on the file server that holds the IP address of the last desktop computer to register as the database server. If that computer does not answer either, the user's own desktop becomes the database server and registers its IP address in the file on the file server. The SQL update statement can then be executed, and other desktop computers can connect directly to his.
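The lookup described above could be sketched roughly as follows. This is only an illustration of the steps, not a tested design: the port number, the function names, and the injectable `probe` parameter are all assumptions made for the example.

```python
import socket
from pathlib import Path
from typing import Callable, Optional, Tuple

DB_PORT = 5544  # hypothetical port the desktop "database server" would listen on


def read_address(path: Path) -> Optional[str]:
    """Return the IP address stored in the file, or None if missing or empty."""
    try:
        text = path.read_text().strip()
    except OSError:
        return None
    return text or None


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def find_or_become_server(
    local_file: Path,
    shared_file: Path,
    my_address: str,
    port: int = DB_PORT,
    probe: Callable[[str, int], bool] = is_reachable,
) -> Tuple[str, bool]:
    """Return (server_address, became_server).

    Tries the address in the local file first, then the one in the file on
    the file server. If neither answers, this desktop registers itself.
    """
    for source in (local_file, shared_file):
        address = read_address(source)
        if address and probe(address, port):
            local_file.write_text(address)  # remember the server that answered
            return address, False

    # Nobody answered: register this desktop as the new database server.
    shared_file.write_text(my_address)
    local_file.write_text(my_address)
    return my_address, True
```

Note that writing the shared file like this is not an atomic election: two desktops that both find the old server dead at the same moment could each register themselves, so a real implementation would need some form of locking or lease on that file.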

The point of this architecture is that the higher the load, the better it functions, because each desktop computer will then almost always have a current IP address for the database server. With this setup, I believe a database placed on a file server could serve hundreds of desktop computers instead of the current 50 or so. I also do not believe that the load on the single desktop computer that has become the database server will ever be noticeable, as there will be no hard disk operations on that desktop, only on the file server.

Is this idea feasible? Does it already exist? What kind of database could support such an architecture?

Edit: I should point out that this idea is not pretty, stable, best practice, or something I would really be proud of. The reason I am still interested in its feasibility is that some of my clients are banks, and the bureaucracy involved in gaining access to a database is enormous. Often the project sponsor on such projects needs to be above Vice President level, because of their extreme security concerns about granting access to servers. Needless to say, this means that setting up a Wiki takes a lot of work. Later, if the Wiki proves successful, it should of course be migrated onto a proper database server.

Edit2: The reason for this idea is to reduce the risk of writer starvation when using SQLite with the database placed on a file server. This problem is described in section 5.1 here. Using a desktop computer to hold a cache of the most frequently accessed information (i.e. Wiki pages) would dramatically reduce the workload on the file server, which in turn should improve the user experience. Do you really think that I am still way off with this idea?

Best Answer

You could actually build a good distributed database environment if you partition (or target) your reads and writes at different databases. We do such work, and the trick is very simple. You keep the master database on a file server and target all writes at it. You keep a local copy of the database on every user's computer and target the reads at it.

You then also need a synchronizing mechanism between the master database and the local databases. This can be done in multiple ways. One way is to have a "delta" table in the master database. This delta table contains the transactions that have been applied to the master database. Whenever the user's application performs a read or write operation, the delta on the master is checked first and the local copy is brought up to date. Only the transactions in the delta that have not yet been applied locally (which can be determined from their time stamps) need to be applied. You could even have a background process doing this continuously.

The delta can be flushed daily (or weekly). If a user has not logged on for a week or so, you simply copy the whole database over to the user's computer. The advantage of having a local copy is that users can query data even when they are offline and, believe it or not, this is pretty fast even when you are online and updating.
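A minimal sketch of the delta-table idea, using Python's built-in `sqlite3` module, might look like the following. The table names (`pages`, `delta`, `sync_state`), the choice to log each write as SQL text plus JSON-encoded parameters, and the function names are all assumptions made for illustration. The sketch also swaps the time stamp mentioned above for an auto-incrementing sequence number, which avoids ties between transactions logged in the same instant.

```python
import json
import sqlite3

PAGES_SCHEMA = "CREATE TABLE IF NOT EXISTS pages (title TEXT PRIMARY KEY, body TEXT)"


def init_master(conn: sqlite3.Connection) -> None:
    """Create the data table and the delta log on the master database."""
    conn.execute(PAGES_SCHEMA)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS delta ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "sql TEXT NOT NULL, params TEXT NOT NULL)"
    )
    conn.commit()


def init_local(conn: sqlite3.Connection) -> None:
    """Create the data table and a one-row sync marker on a local copy."""
    conn.execute(PAGES_SCHEMA)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sync_state ("
        "id INTEGER PRIMARY KEY CHECK (id = 1), last_seq INTEGER NOT NULL)"
    )
    conn.execute("INSERT OR IGNORE INTO sync_state (id, last_seq) VALUES (1, 0)")
    conn.commit()


def write_master(master: sqlite3.Connection, sql: str, params: tuple) -> None:
    """Apply a write to the master and log it in the delta, in one transaction."""
    with master:
        master.execute(sql, params)
        master.execute(
            "INSERT INTO delta (sql, params) VALUES (?, ?)",
            (sql, json.dumps(params)),
        )


def pull_delta(master: sqlite3.Connection, local: sqlite3.Connection) -> int:
    """Replay delta entries the local copy has not seen yet; return how many."""
    (last_seq,) = local.execute(
        "SELECT last_seq FROM sync_state WHERE id = 1"
    ).fetchone()
    rows = master.execute(
        "SELECT seq, sql, params FROM delta WHERE seq > ? ORDER BY seq",
        (last_seq,),
    ).fetchall()
    with local:  # replay and marker update commit together
        for seq, sql, params in rows:
            local.execute(sql, json.loads(params))
            last_seq = seq
        local.execute("UPDATE sync_state SET last_seq = ? WHERE id = 1", (last_seq,))
    return len(rows)
```

An application would call `pull_delta` before each read (or from a background thread) and route every write through `write_master`. Flushing old delta rows and falling back to a full copy for clients that are too far behind are left out of this sketch.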
