Where and what are the resources?
REST is all about addressing resources in a stateless, discoverable manner. It does not have to be implemented over HTTP, nor does it have to rely on JSON or XML, although it is strongly recommended that a hypermedia data format is used (see the HATEOAS principle) since links and ids are desirable.
So, the question becomes: How does one think about synchronization in terms of resources?
What is bi-directional sync?
Bi-directional sync is the process of updating the resources present on a graph of nodes so that, at the end of the process, all nodes have updated their resources in accordance with the rules governing those resources. Typically, this is understood to be that all nodes would have the latest version of the resources as present within the graph. In the simplest case the graph consists of two nodes: local and remote. Local initiates the sync.
So the key resource that needs to be addressed is a transaction log and, therefore, a sync process might look like this for the "items" collection under HTTP:
Step 1 - Local retrieves the transaction log
Local: GET /remotehost/items/transactions?earliest=2000-01-01T12:34:56.789Z
Remote: 200 OK with a body containing the transaction log, with fields similar to these:
itemId - a UUID to provide a shared primary key
updatedAt - a timestamp to provide a co-ordinated point when the data was last updated (assuming that a revision history is not required)
fingerprint - a SHA1 hash of the contents of the data for rapid comparison if updatedAt is a few seconds out
itemURI - a full URI to the item to allow retrieval later
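As a rough illustration of such a log entry, here is a minimal Python sketch. The class name `LogEntry`, the helper names, and the choice of canonical JSON as the hashing input are assumptions made for the example; the answer only specifies the four fields.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LogEntry:
    item_id: str          # itemId: shared UUID primary key
    updated_at: datetime  # updatedAt: when the data was last updated
    fingerprint: str      # SHA-1 hex digest of the item contents
    item_uri: str         # itemURI: full URI for later retrieval


def fingerprint(item: dict) -> str:
    # Assumption: hash a canonical JSON form (sorted keys, fixed separators)
    # so that equal contents produce equal digests on both nodes.
    canonical = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def make_entry(item_id: str, item: dict, updated_at: datetime, base_uri: str) -> LogEntry:
    # Hypothetical URI layout: <base>/items/<itemId>
    return LogEntry(item_id, updated_at, fingerprint(item), f"{base_uri}/items/{item_id}")


entry = make_entry(
    "3f2b6c1e-0000-4000-8000-000000000001",
    {"name": "widget", "qty": 3},
    datetime(2000, 1, 1, 12, 34, 56, tzinfo=timezone.utc),
    "https://remotehost",
)
```

Any stable serialization would do in place of canonical JSON, as long as both nodes agree on it.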
Step 2 - Local compares the remote transaction log with its own
This is the application of the business rules of how to sync. Typically, the itemId identifies the local resource and the fingerprints are then compared. If they differ, updatedAt is compared. If the timestamps are too close to call then a decision will need to be made to pull from the other node (perhaps it is more important), or to push to the other node (this node is more important). If the remote resource is not present locally then a push entry is made (this contains the actual data for insert/update). Any local resources not present in the remote transaction log are assumed to be unchanged.
The pull requests are made against the remote node using the itemURI so that the data exists locally. They are not applied locally until later.
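The per-item decision described in this step could be sketched as follows. This covers only the case where the item exists on both nodes; the function name `decide`, the tuple layout, the two-second tolerance, and the `prefer_remote` flag are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def decide(local, remote, tolerance=timedelta(seconds=2), prefer_remote=True):
    """Compare two (updated_at, fingerprint) tuples for the same itemId.

    Returns 'skip', 'pull' (take the remote copy) or 'push' (send the local copy).
    """
    local_updated, local_fp = local
    remote_updated, remote_fp = remote

    # Identical contents: nothing to do, whatever the timestamps say.
    if local_fp == remote_fp:
        return "skip"

    # Timestamps too close to call: fall back to which node is more important.
    if abs(remote_updated - local_updated) <= tolerance:
        return "pull" if prefer_remote else "push"

    # Otherwise the most recently updated copy wins.
    return "pull" if remote_updated > local_updated else "push"


t0 = datetime(2000, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
action = decide((t0, "aaa"), (t0 + timedelta(minutes=5), "bbb"))
```

Here the remote copy is five minutes newer and the fingerprints differ, so the sketch chooses to pull.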
Step 3 - Push local sync transaction log to remote
Local: PUT /remotehost/items/transactions with body containing the local sync transaction log.
The remote node might process this synchronously (if it's small and quick) or asynchronously (think 202 ACCEPTED) if it's likely to incur a lot of overhead. Assuming a synchronous operation, then the outcome will be either 200 OK or 409 CONFLICT depending on the success or failure. In the case of a 409 CONFLICT, then the process has to be started again since there has been an optimistic locking failure at the remote node (someone changed the data during the sync). The remote updates are processed under their own application transaction.
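The restart-on-conflict behaviour might look roughly like this in Python. The HTTP call is abstracted behind a hypothetical `put_log` callable that returns a status code, and `build_log` stands in for re-running Steps 1 and 2; neither name comes from the answer, and the attempt limit is an added safeguard.

```python
def push_with_retry(build_log, put_log, max_attempts=3):
    """Push the local sync log, restarting the sync on 409 CONFLICT.

    build_log: callable that re-runs Steps 1 and 2 and returns a fresh log.
    put_log:   callable that PUTs the log and returns the HTTP status code.
    Returns (status, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        status = put_log(build_log())
        if status in (200, 202):
            # 200: applied synchronously; 202: accepted for later processing.
            return status, attempt
        if status != 409:
            raise RuntimeError(f"unexpected status {status}")
        # 409: the remote data changed during the sync, so start again.
    raise RuntimeError("gave up after repeated 409 CONFLICT responses")
```

A real client would also need to poll for the outcome after a 202, which this sketch leaves out.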
Step 4 - Update locally
The data pulled in Step 2 is applied locally under an application transaction.
While the above is not perfect (there are several situations where local and remote may get into trouble and having remote pull data from local is probably more efficient than stuffing it into a big PUT) it does demonstrate how REST can be used during a bi-directional synchronization process.
Is it appropriate to mix some sort of action call with a resource URI (e.g. /collection/123?action=resendEmail)? Would it be better to specify the action and pass the resource id to it (e.g. /collection/resendEmail?id=123)? Is this the wrong way to be going about it? Traditionally (at least with HTTP) the action being performed is the request method (GET, POST, PUT, DELETE), but those don't really allow for custom actions with a resource.
I'd rather model that in a different way, with a collection of resources representing the emails that are to be sent; the sending will be processed by the internals of the service in due course, at which point the corresponding resource will be removed. (Or the user could DELETE the resource early, causing a canceling of the request to do the send.)
Whatever you do, don't put verbs in the resource name! That's the noun (and the query part is the set of adjectives). Nouning verbs weirds REST!
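One way to picture the "collection of emails to be sent" idea is the in-memory Python sketch below. The class name `Outbox`, the `/outbox/<id>` URI layout, and the returned status codes are assumptions for illustration, not part of the answer.

```python
import uuid


class Outbox:
    """Pending send requests modeled as resources rather than as an action."""

    def __init__(self):
        self._pending = {}

    def post(self, email: dict) -> str:
        # POST /outbox: create a send request and return its URI.
        request_id = str(uuid.uuid4())
        self._pending[request_id] = email
        return f"/outbox/{request_id}"

    def delete(self, uri: str) -> int:
        # DELETE /outbox/<id>: cancel the send if it has not happened yet.
        request_id = uri.rsplit("/", 1)[-1]
        return 204 if self._pending.pop(request_id, None) is not None else 404

    def process(self, send) -> None:
        # Service internals: send each pending email, removing its resource.
        for request_id in list(self._pending):
            send(self._pending.pop(request_id))


outbox = Outbox()
uri = outbox.post({"to": "user@example.com", "template": "welcome"})
cancelled = outbox.delete(uri)  # cancelled before the service processed it
```

The URI contains only nouns; "resend" becomes "create another resource in the outbox".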
I use the querystring portion of the URL to filter the set of resources returned when querying a collection (e.g. /collection?someField=someval). Within my API controller I then determine what kind of comparison it is going to do with that field and value. I've found this really doesn't work. I need a way to allow the API user to specify the type of comparison they want to perform.
The best idea I've come up with so far is to allow the API user to specify it as an appendage to the field name (e.g. /collection?someField:gte=someval) to indicate that it should return resources where someField is greater than or equal to (>=) whatever someval is. Is this a good idea? A bad idea? If so, why? Is there a better way to allow the user to specify the type of comparison to perform with the given field and value?
I'd rather specify a general filter clause and have that as an optional query parameter on any request to fetch the contents of the collection. The client can then specify exactly how to restrict the set returned, in whatever way you desire. I'd also worry a bit about the discoverability of the filter/query language; the richer you make it, the harder it is for arbitrary clients to discover. An alternative approach which, at least theoretically, deals with that discoverability issue is to allow making restriction sub-resources of the collection, which clients obtain by POSTing a document describing the restriction to the collection resource. It's still a slight abuse, but at least it's one you can clearly make discoverable!
This sort of discoverability is one of the things that I find least strong with REST.
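To make the `someField:gte=someval` suffix from the question concrete, here is a small Python sketch of how a server might parse and apply it. The operator names other than `gte`, the default of `eq`, and the numeric coercion are assumptions; this is one possible filter syntax, not a standard.

```python
import operator
from urllib.parse import parse_qsl

# Assumed operator vocabulary; only "gte" appears in the question.
OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "gt": operator.gt, "gte": operator.ge,
    "lt": operator.lt, "lte": operator.le,
}


def coerce(value: str):
    # Treat numeric-looking values as numbers so ordering comparisons work.
    for cast in (int, float):
        try:
            return cast(value)
        except ValueError:
            pass
    return value


def parse_filters(query: str):
    """Turn 'someField:gte=5&name=rex' into (field, op, value) triples."""
    filters = []
    for key, value in parse_qsl(query):
        field, _, op = key.partition(":")
        op = op or "eq"  # no suffix means plain equality
        if op not in OPS:
            raise ValueError(f"unsupported comparison: {op}")
        filters.append((field, op, coerce(value)))
    return filters


def matches(row: dict, filters) -> bool:
    return all(field in row and OPS[op](row[field], value) for field, op, value in filters)


rows = [{"someField": 3}, {"someField": 5}, {"someField": 9}]
selected = [r for r in rows if matches(r, parse_filters("someField:gte=5"))]
```

Rejecting unknown operators outright keeps the filter language small, which helps with the discoverability concern raised above.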
I often see URIs that look something like /person/123/dogs to get the person's dogs. I generally have avoided something like that because in the end I figure that by creating a URI like that you are actually just accessing a dogs collection filtered by a specific person ID. It would be equivalent to /dogs?person=123. Is there ever really a good reason for a REST URI to be more than two levels deep (/collection/resource_id)?
When the nested collection is truly a sub-feature of the outer collection's member entities, it is reasonable to structure them as a sub-resource. By “sub-feature” I mean something like a UML composition relation, where destroying the outer resource naturally means destroying the inner collection.
Other types of collection can be modeled as an HTTP redirect; thus /person/123/dogs can indeed be responded to by doing a 307 that redirects to /dogs?person=123. In this case, the collection isn't actually UML composition, but rather UML aggregation. The difference matters; it is significant!
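The redirect for the aggregation case could be sketched like this in Python. The function name `route` and the `(status, headers)` return shape are assumptions for the example, not any particular framework's API.

```python
import re
from urllib.parse import urlencode

# Matches the nested form /person/<id>/dogs (optional trailing slash).
NESTED_DOGS = re.compile(r"^/person/(?P<person_id>[^/]+)/dogs/?$")


def route(path: str):
    """Return (status, headers) for a request path."""
    match = NESTED_DOGS.match(path)
    if match:
        # Aggregation, not composition: point the client at the filtered
        # top-level collection instead of serving a nested resource.
        location = "/dogs?" + urlencode({"person": match.group("person_id")})
        return 307, {"Location": location}
    return 404, {}


status, headers = route("/person/123/dogs")
```

With this in place, /dogs?person=123 remains the single canonical address for the filtered collection.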
Best Answer
First, an alternative to consider...
After many years of designing and implementing web services (and inheriting some rather sub-par implementations as well), I've reached a conclusion that some others have as well: Avoid nested resource paths.
Matthew Beale's Suggested REST API Practices article explains the reasoning behind this. (Look in the First-class Models section.)
But if you insist...
1. If you have to (or prefer to) use nested resource paths, I would suggest grouping methods based on what type of resource they return. So /users/:id/games would be implemented in GamesController. My reasoning here is that it maintains a more consistent correlation between API classes and DAL classes, which helps avoid guesswork later on, and reduces the number of inter-class dependencies.
2. I think your querystring suggestion is the cleanest, basically for the same reasons as #1 (it returns a game, so it should be under the /games resource, and implemented in GamesController). Something like /games?keyword=hot&limit=1 resembles the pattern of successful, intuitive API approaches I've seen used elsewhere.
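A minimal Python sketch of "group by returned resource type": both the nested path and the querystring form end up in the same controller because both return games. The `dispatch` helper, the `index` method, and the record layout are hypothetical; only the `GamesController` name and the two URL shapes come from the answer.

```python
class GamesController:
    def __init__(self, games):
        # Assumed record shape: {"id": ..., "user_id": ..., "title": ...}
        self._games = games

    def index(self, keyword=None, limit=None, user_id=None):
        result = [
            g for g in self._games
            if (user_id is None or g["user_id"] == user_id)
            and (keyword is None or keyword.lower() in g["title"].lower())
        ]
        return result if limit is None else result[:limit]


def dispatch(controller, path, params):
    parts = path.strip("/").split("/")
    if parts == ["games"]:
        # /games?keyword=hot&limit=1
        limit = int(params["limit"]) if "limit" in params else None
        return controller.index(keyword=params.get("keyword"), limit=limit)
    if len(parts) == 3 and parts[0] == "users" and parts[2] == "games":
        # /users/:id/games returns games, so it is handled by the same controller.
        return controller.index(user_id=parts[1])
    raise LookupError(path)
```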