So I have a relatively simple system. A mobile client creates records in a sqlite database that I would like to have synced to a remote SQL server (that is shared with other mobile clients). So when I create a new record in the phone's sqlite table, I then push that change to my remote service through a RESTful API. The problem I'm having, is how do I order the primary keys so that there isn't collisions in the data (i.e. a record in the phone has the same primary key as a completely different record on the server). What is the usual "best practice for referencing the record on the client, and for referencing the same record on the server?
Referential Integrity – Maintaining Between Mobile Client and Server
sqlweb services
Related Solutions
If you must generate keys on the clients, then you'll need some way to avoid collisions (two clients creating the same key). You can do this using GUIDs, because GUIDs are huge and the algorithms that generate them have been designed to minimise collisions. If you must use INTs, then you'll need to roll your own method, probably having each client reserve a block of IDs ahead of time, which is just going to be convoluted, so don't do that.
It's much easier to generate keys on the server, because it's always around and it knows what has and hasn't been used. In SQL Server, you can mark your id column with IDENTITY , to have SQL Server automatically generate an ID when you insert a record into that table. You can get the value that SQL Server used in the insert you just made using the @@IDENTITY function or the SCOPE_IDENTITY function. If you are using a stored procedure you can return the identity values as an output parameter or in the resultset, or you can SELECT SCOPE_IDENTITY()
from your client.
Deep down, in the guts of a relational database, its all rows and columns. That is the structure that a relational database is optimized to work with. Cursors work on individual rows at a time. Some operations create temporary tables (again, it needs to be rows and columns).
By working with only rows and returning only rows, the system is able to better deal with memory and network traffic.
As mentioned, this allows for certain optimizations to be done (indexes, joins, unions, etc...)
If one was to want a nested tree structure, this requires that one pulls all the data at once. Gone are the optimizations for the cursors on the database side. Likewise, the traffic over the network becomes one big burst that can take much longer than the slow trickle of row by row (this is something that is occasionally lost in today's web world).
Every language has arrays within it. These are easy things to work with and interface with. By using a very primitive structure, the driver between the database and program - no matter what language - can work in a common way. Once one starts adding trees, the structures in the language become more complex and more difficult to traverse.
It isn't that hard for a programing language to convert the rows returned into some other structure. Make it into a tree or a hash set or leave it as a list of rows that you can iterate over.
There is also history at work here. Transferring structured data was something ugly in the days of old. Look at the EDI format to get an idea of what you might be asking for. Trees also imply recursion - which some languages didn't support (the two most important languages of the old days didn't support recursion - recursion didn't enter Fortran until F90 and of the era COBOL didn't either).
And while the languages of today have support for recursion and more advanced data types, there isn't really a good reason to change things. They work, and they work well. The ones that are changing things are the nosql databases. You can store trees in documents in a document based one. LDAP (its actually oldish) is also a tree based system (though its probably not what you're after). Who knows, maybe the next thing in nosql databases will be one that returns back the query as a json object.
However, the 'old' relational databases... they're working with rows because thats what they're good at and everything can talk to them without trouble or translation.
- In protocol design, perfection has been reached not when there is nothing left to add, but when there is nothing left to take away.
Best Answer
The normal practice is to structure your database with
uniqueidentifier
keys (sometimes called UUIDs or GUIDs). You can create them in two places without realistic fear of collision.Next, your Mobile app needs to sync "fact" tables from the server before you can create new rows. When you create new rows, you do it locally, and when you sync again, new rows are added to the server. You can do the same thing with updates and deletes too.
To track inserts you need a Created timestamp on your rows. To track updates you need to track a LastUpdate timestamp on your rows. To track deletes you need a tombstone table.
Note that when you do a sync, you need to check the time offset between the server and the mobile device, and you need to have a method for resolving conflicts. Inserts are no big deal (they shouldn't conflict), but updates could conflict, and a delete could conflict with an update.
There are frameworks to handle this kind of thing, such as Microsoft Sync Framework.