Looking over the changes to your question, I think I understand the problem you are facing a bit better. As there is no field that acts as an identifier on your resources (just a link), you have no way to refer to a specific resource within your GUI (i.e. a link to a page describing a specific pet).
The first thing to determine is whether a pet ever makes sense without an owner. If we can have a pet without any owner, then we need some sort of unique property on the pet that we can use to refer to it. I do not believe this violates the rule against exposing the ID directly, as the actual resource ID would still be tucked away in a link that the REST client never parses. With that in mind, our pet resource may look like:
<Entity type="Pet">
<Link rel="self" href="http://example.com/pets/1" />
<Link rel="owner" href="http://example.com/people/1" />
<UniqueName>Spot</UniqueName>
</Entity>
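As a sketch of how a client might consume a representation like that without ever parsing IDs out of URLs: the element names below mirror the example above, but the parsing code itself is purely illustrative.

```python
import xml.etree.ElementTree as ET

PET_XML = """
<Entity type="Pet">
  <Link rel="self" href="http://example.com/pets/1" />
  <Link rel="owner" href="http://example.com/people/1" />
  <UniqueName>Spot</UniqueName>
</Entity>
"""

def parse_entity(xml_text):
    """Extract the links (keyed by rel) and the UniqueName from an Entity."""
    root = ET.fromstring(xml_text)
    links = {link.get("rel"): link.get("href") for link in root.findall("Link")}
    return {"type": root.get("type"),
            "links": links,
            "name": root.findtext("UniqueName")}

pet = parse_entity(PET_XML)
# The client works with pet["name"] and follows pet["links"][rel];
# the hrefs stay opaque, so the numeric ID is never extracted.
```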
We can now update the name of that pet from Spot to Fido without having to mess with any actual resource IDs throughout the application. Likewise, we can refer to that pet in our GUI with something like:
http://example.com/GUI/pets/Spot
If the pet does not make any sense without an owner (or pets are not allowed in the system without an owner) then we can use the owner as part of the "identity" of the pet in the system:
http://example.com/GUI/owners/John/pets/1 (first pet in the list for John)
One small note: if both Pets and People can exist separately of each other, I would not make the "People" resource the entry point for the API. Instead I would create a more generic resource that contains links to both People and Pets. It could return a resource that looks like:
<Entity type="ResourceList">
<Link rel="people" href="http://example.com/api/people" />
<Link rel="pets" href="http://example.com/api/pets" />
</Entity>
So, knowing only the first entry point into the API, and without parsing any of the URLs to extract system identifiers, we can do something like this:
User logs into the application. The REST client accesses the entire list of people resources available which may look like:
<Entity type="Person">
<Link rel="self" href="http://example.com/api/people/1" />
<Pets>
<Link rel="pet" href="http://example.com/api/pets/1" />
<Link rel="pet" href="http://example.com/api/pets/2" />
</Pets>
<UniqueName>John</UniqueName>
</Entity>
<Entity type="Person">
<Link rel="self" href="http://example.com/api/people/2" />
<Pets>
<Link rel="pet" href="http://example.com/api/pets/3" />
</Pets>
<UniqueName>Jane</UniqueName>
</Entity>
The GUI would loop through each resource and print out a list item for each person using the UniqueName as the "id":
<a href="http://example.com/gui/people/John">John</a>
<a href="http://example.com/gui/people/Jane">Jane</a>
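That loop can be sketched as follows; the people are assumed to have already been parsed into dictionaries carrying their UniqueName, and the GUI URL scheme is the illustrative one used throughout this answer.

```python
def render_people(people):
    """Render one GUI anchor per person, keyed by the UniqueName."""
    return [
        '<a href="http://example.com/gui/people/{0}">{0}</a>'.format(p["name"])
        for p in people
    ]

links = render_people([{"name": "John"}, {"name": "Jane"}])
```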
While doing this it could also process each link that it finds with a rel of "pet" and get the pet resource such as:
<Entity type="Pet">
<Link rel="self" href="http://example.com/api/pets/1" />
<Link rel="owner" href="http://example.com/api/people/1" />
<UniqueName>Spot</UniqueName>
</Entity>
Using this it can print a link such as:
<!-- Assumes that a pet can exist without an owner -->
<a href="http://example.com/gui/pets/Spot">Spot</a>
or
<!-- Assumes that a pet MUST have an owner -->
<a href="http://example.com/gui/people/John/pets/Spot">Spot</a>
If we go with the first link and assume that our entry resource has a link with a relation of "pets" the control flow would go something like this in the GUI:
- Page is opened and the pet Spot is requested.
- Load the list of resources from the API entry point.
- Load the resource that is related with the term "pets".
- Look through each resource from the "pets" response and find one that matches Spot.
- Display the information for Spot.
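The steps above can be sketched like this. `fetch` stands in for an HTTP GET that returns a parsed entity (stubbed here with an in-memory dictionary whose shapes mirror the example resources); only the entry-point URL is hard-coded, everything else is reached by following links.

```python
ENTRY_POINT = "http://example.com/api"

# In-memory stand-in for the API; a real fetch() would issue an HTTP GET
# and parse the XML entities shown above.
RESOURCES = {
    "http://example.com/api": {
        "links": {"people": "http://example.com/api/people",
                  "pets": "http://example.com/api/pets"}},
    "http://example.com/api/pets": {
        "pets": ["http://example.com/api/pets/1",
                 "http://example.com/api/pets/2"]},
    "http://example.com/api/pets/1": {"name": "Spot"},
    "http://example.com/api/pets/2": {"name": "Fido"},
}

def fetch(url):
    return RESOURCES[url]

def find_pet(unique_name):
    # 1. Load the entry-point resource.
    entry = fetch(ENTRY_POINT)
    # 2. Follow the link related by "pets" -- never a hard-coded URL.
    pet_list = fetch(entry["links"]["pets"])
    # 3. Look through each pet resource for the matching UniqueName.
    for pet_url in pet_list["pets"]:
        pet = fetch(pet_url)
        if pet["name"] == unique_name:
            return pet
    return None
```

Note that the GUI never takes the URLs apart; it only compares UniqueNames, which is what lets the server restructure its URLs freely.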
Using the second link would involve a similar chain of events, except that People is the entry point to the API: we would first get the list of all people in the system, find the one that matches, then find all pets that belong to that person (using the rel attribute again), and finally find the one named Spot so we can display the specific information related to it.
I think your instincts are largely correct; those proclaimed benefits really aren't all that great, as for any non-trivial web application the clients are going to have to care about the semantics of what they're doing as well as the syntax.
But that doesn't mean that you shouldn't make your application follow the principles of HATEOAS!
What does HATEOAS really mean? It means structuring your application so that it is in principle like a web site, and that all operations that you might want to do can be discovered without having to download some complex schema. (Sophisticated WSDL schemas can cover everything, but by the time they do, they've exceeded the ability of virtually every programmer to ever understand, let alone write! You can view HATEOAS as a reaction against such complexity.)
HATEOAS does not just mean rich links. It means using the HTTP standard's error mechanisms to indicate more exactly what went wrong; you don't have to just respond with “waaah! no” and can instead provide a document describing what was actually wrong and what the client might do about it. It also means supporting things like OPTIONS requests (the standard way of allowing clients to find out what HTTP methods they can use) and content type negotiation so that the format of the response can be adapted to a form that clients can handle. It means putting in explanatory text (or, more likely, links to it) so that clients can look up how to use the system in non-trivial cases if they don't know; the explanatory text might be human readable or it might be machine readable (and can be as complex as you want). Finally, it means that clients do not synthesise links (except for query parameters); clients will only use a link if you told it to them.
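As a rough illustration of two of those mechanisms: the status codes and the Allow header below are standard HTTP, but the handler shape and the error-document fields are invented for this sketch.

```python
ALLOWED_METHODS = {"/api/pets": ["GET", "POST", "OPTIONS"]}

def handle(method, path):
    """Return (status, headers, body) the way a HATEOAS-friendly server might."""
    allowed = ALLOWED_METHODS.get(path)
    if allowed is None:
        # Don't just say "waaah! no": describe the problem and a way forward.
        return (404, {}, {"error": "unknown resource",
                          "hint": "start from the entry point",
                          "entry": "http://example.com/api"})
    if method == "OPTIONS":
        # The standard way for clients to discover what they may do here.
        return (204, {"Allow": ", ".join(allowed)}, None)
    if method not in allowed:
        return (405, {"Allow": ", ".join(allowed)},
                {"error": "%s not supported on %s" % (method, path)})
    return (200, {}, {"pets": []})
```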
You have to think about having the site browsed by a user (who can read JSON or XML instead of HTML, so a little weird) with a great memory for links and an encyclopædic knowledge of the HTTP standards, but otherwise no knowledge of what to do.
And of course, you can use content type negotiation to serve up an HTML(5)/JS client that will let them use your application, if that's what their browser is prepared to accept. After all, if your RESTful API is any good, that should be “trivial” to implement on top of it?
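A toy version of that negotiation step: real servers implement the full Accept-header grammar with quality values, while this sketch just walks the listed types in the order the client sent them.

```python
def negotiate(accept_header, available):
    """Pick the first media type the client accepts that the server can produce."""
    for wanted in (t.split(";")[0].strip() for t in accept_header.split(",")):
        if wanted == "*/*":
            return available[0]   # server's preferred representation
        if wanted in available:
            return wanted
    return None                   # would become a 406 Not Acceptable

AVAILABLE = ["application/xml", "text/html"]
```

A browser sending `Accept: text/html,...` would get the HTML/JS client, while an API consumer asking for `application/xml` gets the raw representation, from the same URI.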
A web application, RESTful or not, is generally not simply a data service; it exposes various resources and provides some behavior, and so it has state; a distinction is made between resource state (client-independent, managed by the application), and application state, which is client-specific. A stateless application doesn't store (client-specific) application state; instead, it lets the client be responsible for it, and provides an API that makes it possible to transfer (application) state back and forth (thus "State Transfer" in REST). From the perspective of a client, a web application is a state machine, residing behind an API that allows for interaction, but part of the state is provided as contextual information by the clients, to supplement requests.
Now, while studying REST, you may have stumbled upon something called the Richardson Maturity Model. It describes the "maturity" of a web application's API (as APIs have evolved over the years), but it's useful to us as a sort of reference that puts things in context. In this model, all of the maturity levels except the final one essentially provide APIs that facilitate RPCs over HTTP. In this style of API design, HTTP is used as a transport mechanism, but the actual communication happens over custom protocols, so the interacting systems (clients and the application) rely on so-called "out-of-band information" to communicate (in this context, this just means that these systems communicate using some custom standard rather than leveraging hypertext/hypermedia and, say, the existing HTTP protocol). So what drives state transfer and state transitions ("the engine of application state") is not hypermedia in this case.
The final maturity level introduces the HATEOAS constraint, and only then the API becomes RESTful. The client initiates the interaction through the initial URI; the server responds with a suitable hypermedia-based representation of the application state (which may differ between devices, or clients, or due to various conditions - thus "Representational" in REST), which includes self-describing actions that let the client initiate the next state transition (so hypermedia is now what directly supports and drives application state).
Let me first make sure that we are on the same page. This is not client state (as in, the internal state of the client itself), but rather, the state of the web application that's particular to a specific client.
The example you mention doesn't really illustrate it well, but the list of links returned is essentially generated dynamically on the server and represents the currently available state transitions; as such, it encodes the current application state (for that particular client). Note that you may choose to transfer other bits of state-related information (in both directions) if your application requires it (so you are not limited to state-transition metadata), but the constraint is to never remember any client-specific data on the server, because that hurts scalability. Note also that this state doesn't have to be complete (it doesn't have to be entirely meaningful when you look at it in isolation), but it has to be enough for the receiving party to make a decision and perform logic based on it and nothing else (so no out-of-band information should be required, besides what's taken to be the common ground for the network: the standard protocols and data formats used).
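To make "the links encode the current application state" concrete, here is a sketch of a server-side function that derives the available transitions from resource state alone; the order statuses and link relations are invented for the illustration.

```python
def order_links(order):
    """Compute the link list fresh from resource state on every request;
    nothing client-specific is remembered between requests."""
    base = "http://example.com/api/orders/%s" % order["id"]
    links = [{"rel": "self", "href": base}]
    if order["status"] == "open":
        # An open order can still be paid for or cancelled.
        links.append({"rel": "pay", "href": base + "/payment"})
        links.append({"rel": "cancel", "href": base})
    elif order["status"] == "paid":
        # Once paid, the only transition left is tracking the shipment.
        links.append({"rel": "track", "href": base + "/shipment"})
    return links
```

The client sees only the transitions that are valid right now; the server can change the rules (or the URLs) without any client redeployment.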
HATEOAS leverages the uniform interface (the common standards and data exchange formats) to decouple the clients and servers so that on the server side they can shuffle stuff around behind the "contract" defined by the hypermedia type, but also because communication based on out-of-band information (custom protocols) often doesn't leverage network infrastructure in a way that REST aims to do. (See the discussion below.)
In your example, the client wouldn't base its logic on the URI, but on metadata (or annotations), like the "rel" attribute. E.g., a browser doesn't care about the URIs in links; it just needs to know what kind of link it is (a clickable link, a form from which a URI can be constructed, a stylesheet reference, etc.).
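In code, that link selection looks something like the following; the entity structure is assumed, mirroring the examples earlier in the thread.

```python
def link_for(entity, rel):
    """Select a link by its rel attribute; the href stays opaque to the client."""
    for link in entity["links"]:
        if link["rel"] == rel:
            return link["href"]
    raise KeyError("no link with rel=%r" % rel)

pet = {"links": [{"rel": "self", "href": "http://example.com/api/pets/1"},
                 {"rel": "owner", "href": "http://example.com/api/people/1"}]}
```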
REST in Context
Unfortunately, REST has become a buzzword, and everybody is talking about how to be RESTful, but the entire context of REST is missing, and you cannot make sense of the REST architectural style without understanding this context (what is the problem that REST is actually trying to solve).
REST is a generalization of the architecture behind the Web. For most of us, the Web is a platform. But for people who developed REST, WWW is an application, which has a certain set of requirements, running on a world-wide network. So REST is meant for systems that are like the Web in some important respects, that need to satisfy a certain set of properties.
These are large-scale networked systems that are long-lived (think decades). These are systems that span organizational boundaries (e.g. collaborating companies), or boundaries between different subentities in a large organization (different divisions, even different teams). Even though there's collaboration, the entities involved all largely operate (do work, and develop and deploy software) on their own terms, at their own pace, with their own security concerns, and they all use different devices, operating systems, etc. They need to access, and share references to, each other's resources (documents, services, data), while being able to evolve independently and incrementally, without having to do extensive coordination (heck, it's hard to get people to do a coordinated deployment even within the same organization).
Those providing services need to be able to do things like evolve service versions, add nodes, or shuffle data around with minimal effect on clients. They need to scale. Clients (which may themselves be services) need to keep working despite all of this activity on the server side. The systems are likely to be used in unanticipated ways in the future. The resources they access and exchange may be of many different types, and realized (coded, typed, represented, structured) internally (by service providers) in many different ways, but even so, for the overall system/network, a consistent way to access resources and structure data and responses is required (uniform interface).
REST takes all this into account, and very much takes into consideration the properties of the network. It is meant to address the needs of applications that are, on a high level (within their own business domain), similar in terms of requirements and constraints to what's outlined above (or aspire to be).
And it does, but it's not a panacea. There are costs and trade-offs. It imposes a certain communication style, and there's a performance hit. You are always transferring data in a coarse-grained way, often the same data repeatedly, in a format that is generalized (and so may not be the most efficient for your service), with a bunch of metadata, often across intermediate network nodes (thus all the caching on the Web - it minimizes sending data back and forth, keeping the client away from the service when possible). User-perceived performance is important, which affects how you write clients (this is why browsers start rendering the page before everything is downloaded or ready). But you happily pay that cost to be able to build a system with the properties described above.
Conversely, if you are building a system that has different requirements/constraints, the cost may not be worth it.