REST API Design – URI vs Query String

apidesignrest

Let's say I have three resources that are related like so:

Grandparent (collection) -> Parent (collection) -> Child (collection)

The above depicts the relationship among these resources like so: Each grandparent can map to one or several parents. Each parent can map to one or several children. I want the ability to support searching against the child resource but with the filter criteria:

If my clients pass me an id reference to a grandparent, I want to only search against children who are direct descendants of that grandparent.

If my clients pass me an id reference to a parent, I want to only search against children who are direct descendants of my parent.

I have thought of something like so:

GET /myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

and

GET /myservice/api/v1/parents/{parentID}/children?search={text}

for the above requirements, respectively.

But I could also do something like this:

GET /myservice/api/v1/children?search={text}&grandparentID={id}&parentID={id}

In this design, I could allow my client to pass me one or the other in the query string: either grandparentID or parentID, but not both.
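A minimal sketch of how a server might enforce the "one or the other, but not both" rule for this query-string design. The function name `parse_child_search` and the choice to treat a request with neither ID as an unscoped search are assumptions for illustration, not part of the original design.

```python
from urllib.parse import parse_qs


def parse_child_search(query: str) -> dict:
    """Parse the query string of GET /children and reject conflicting scopes."""
    params = parse_qs(query)
    if "grandparentID" in params and "parentID" in params:
        # The design allows grandparentID or parentID, never both.
        raise ValueError("pass grandparentID or parentID, not both")

    scope = None  # assumption: no ID at all means an unscoped search
    for key in ("grandparentID", "parentID"):
        if key in params:
            scope = (key, params[key][0])

    return {"search": params.get("search", [""])[0], "scope": scope}


result = parse_child_search("search=ann&parentID=7")
```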

My questions are:

1) Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

2) What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

3) What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

Thanks!

Best Answer

First

As per RFC 3986 §3.4 (Uniform Resource Identifier (URI): Generic Syntax, Syntax Components, Query):

3.4 Query

The query component contains non-hierarchical data that, along with data in the path component (Section 3.3), serves to identify a resource within the scope of the URI's scheme and naming authority (if any).

Query components are for retrieval of non-hierarchical data; there are few things more hierarchical in nature than a family tree! Ergo, regardless of whether you think it is "REST-y" or not, in order to conform to the formats, protocols, and frameworks of and for developing systems on the internet, you must not use the query string to identify this information.

REST has nothing to do with this definition.
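To make the path/query split concrete, here is a small sketch using Python's standard `urllib.parse.urlsplit`. The host `example.com` and the ID `42` are placeholder assumptions; only the path shape comes from the question.

```python
from urllib.parse import urlsplit

# Placeholder URI modeled on the question's second design.
uri = "https://example.com/myservice/api/v1/parents/42/children?search=ann"
parts = urlsplit(uri)

# The path carries the hierarchical part of the identifier.
hierarchy = [segment for segment in parts.path.split("/") if segment]

# The query carries the non-hierarchical part.
query = parts.query

print(hierarchy)
print(query)
```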

Before addressing your specific questions, your query parameter of "search" is poorly named. Better would be to treat your query segment as a dictionary of key-value pairs.

Your query string could be more appropriately defined as

?first_name={firstName}&last_name={lastName}&birth_date={birthDate} etc.
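Treating the query as a dictionary of key-value pairs could look like the sketch below. The sample values are made up for illustration; `parse_qsl` is the standard-library parser.

```python
from urllib.parse import parse_qsl

# Sample values are hypothetical; the keys follow the naming suggested above.
query = "first_name=Ann&last_name=Lee&birth_date=1990-01-01"

# Each key names the attribute it constrains, unlike a catch-all "search" key.
criteria = dict(parse_qsl(query))
print(criteria)
```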

To answer your specific questions

  1. Which API design is more RESTful, and why? Semantically, they mean and behave the same way. The last resource in the URI is "children", effectively implying that the client is operating on the children resource.

I don't think this is as clear cut as you seem to believe.

None of these resource interfaces are RESTful. The major precondition for the RESTful architectural style is that Application State transitions must be communicated from the server as hypermedia. People have labored over the structure of URIs to make them somehow "RESTful URIs" but the formal literature regarding REST actually has very little to say about this. My personal opinion is that much of the meta-misinformation about REST was published with the intent of breaking old, bad habits. (Building a truly "RESTful" system is actually quite a bit of work. The industry glommed on to "REST" and back-filled some orthogonal concerns with nonsensical qualifications and restrictions.)

What the REST literature does say is that if you are going to use HTTP as your application protocol, you must adhere to the formal requirements of the protocol's specifications and you cannot "make HTTP up as you go and still declare that you are using HTTP"; if you are going to use URIs for identifying your resources, you must adhere to the formal requirements of the specifications regarding URIs/URLs.

Your question is addressed directly by RFC3986 §3.4, which I have linked above. The bottom line on this matter is that even though a conforming URI is insufficient to consider an API "RESTful", if you want your system to actually be "RESTful" and you are using HTTP and URIs, then you cannot identify hierarchical data through the query string because:

3.4 Query

The query component contains non-hierarchical data

...it's as simple as that.

  2. What are the pros and cons to each in terms of understandability from a client's perspective, and maintainability from the designer's perspective.

The "pro" of the first two is that they are on the right path. The "con" of the third one is that it appears to be flat-out wrong.

As far as your understandability and maintainability concerns, those are definitely subjective and depend on the comprehension level of the client developer and the design chops of the designer. The URI specification is the definitive answer as to how URIs are supposed to be formatted. Hierarchical data is supposed to be represented on the path and with path parameters. Non-hierarchical data is supposed to be represented in the query. The fragment is more complicated, because its semantics depend specifically upon the media type of the representation being requested. So to address the "understandability" component of your question, I will attempt to translate exactly what your first two URIs are actually saying. Then, I will attempt to represent what you say you are trying to accomplish with valid URIs.

Translation of your verbatim URIs to their semantic meaning

/myservice/api/v1/grandparents/{grandparentID}/parents/children?search={text}

This says: for the parents of grandparents, find their child having search={text}. What you said with your URI is only coherent if searching for a grandparent's siblings. With your "grandparents, parents, children" you found a "grandparent", went up a generation to their parents, and then came back down to the "grandparent" generation by looking at the parents' children.

/myservice/api/v1/parents/{parentID}/children?search={text}

This says: for the parent identified by {parentID}, find their child having ?search={text}. This is closer to what you want, and represents a parent->child relationship that can likely be used to model your entire API. To model it this way, the burden is placed upon the client to recognize that if they have a "grandparentId", there is a layer of indirection between the ID they have and the portion of the family graph they wish to see. To find a "child" by "grandparentId", you can call your /parents/{parentID}/children service and then, for each child that is returned, search their children for your person identifier.
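The client-side indirection described above can be sketched as two rounds of calls to the same parent->child service. `get_children` is a hypothetical stand-in for `GET /parents/{parentID}/children`, backed here by a made-up in-memory table rather than real HTTP.

```python
# Hypothetical data: who is a direct child of whom.
CHILDREN = {
    "gp1": ["p1", "p2"],
    "p1": ["c1", "c2"],
    "p2": ["c3"],
}


def get_children(parent_id: str) -> list:
    """Stand-in for GET /parents/{parentID}/children."""
    return CHILDREN.get(parent_id, [])


def grandchildren(grandparent_id: str) -> list:
    """Walk two generations down using only the parent->child service."""
    found = []
    for parent_id in get_children(grandparent_id):
        found.extend(get_children(parent_id))
    return found


print(grandchildren("gp1"))
```

The cost of this design is one extra request per intermediate parent, which is the burden on the client mentioned above.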

Implementation of your requirements as URIs

If you want to model a more extensible resource identifier that can walk the tree, I can think of several ways you can accomplish that.

1) The first one, I've already alluded to. Represent the graph of "People" as a composite structure. Each person has a reference to the generation above it through its Parents path and to a generation below it through its Children path.

/Persons/Joe/Parents/Mother/Parents would be a way to grab Joe's maternal grandparents.

/Persons/Joe/Parents/Parents would be a way to grab all of Joe's grandparents.

/Persons/Joe/Parents/Parents?id={Joe.GrandparentID} would grab Joe's grandparent having the identifier you have in hand.

These would all make sense. (Note that there could be a performance penalty here, depending on the task, because the lack of branch identification in the "Parents/Parents/Parents" pattern forces a depth-first search on the server.) You also benefit from the ability to support any arbitrary number of generations. If, for some reason, you want to look up 8 generations, you could represent this as

/Persons/Joe/Parents/Parents/Parents/Parents/Parents/Parents/Parents/Parents?id={Joe.NotableAncestor}

but this leads into the second dominant option for representing this data: through a path parameter.
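As a rough sketch of how a server could resolve the repeated `Parents` segments, the function below walks one generation up per segment over a made-up in-memory graph. The names in `PARENTS` and the `resolve` helper are assumptions for illustration, and named branches such as `Mother` are deliberately not handled.

```python
# Hypothetical family data: person -> list of parents.
PARENTS = {
    "Joe": ["Mary", "Tom"],
    "Mary": ["Ruth", "Carl"],
    "Tom": ["Edna", "Hank"],
}


def resolve(path: str) -> list:
    """Resolve /Persons/{name}/Parents/Parents/... to a list of people."""
    segments = [s for s in path.split("/") if s]
    if len(segments) < 2 or segments[0] != "Persons":
        raise ValueError("expected /Persons/{name}/...")

    current = [segments[1]]
    for segment in segments[2:]:
        if segment != "Parents":
            raise ValueError(f"unsupported segment: {segment}")
        # Each "Parents" segment moves the whole set up one generation.
        current = [p for person in current for p in PARENTS.get(person, [])]
    return current


print(resolve("/Persons/Joe/Parents/Parents"))
```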


2) Use path parameters to "query the hierarchy"

You could develop the following structure to help ease the burden on consumers and still have an API that makes sense.

To look back 147 generations, representing this resource identifier with path parameters allows you to do

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor}

To locate Joe from his Great Grandparent, you could look down the graph a known number of generations for Joe's Id.

/Persons/JoesGreatGrandparent/Children;generations=3?id={Joe.Id}
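A minimal sketch of how the `;generations=N` path parameter might be parsed and applied. Both helpers and the sample `CHILDREN` table are hypothetical; the parsing is done by hand because it only needs to split one path segment on `;` and `=`.

```python
def split_path_params(segment: str):
    """Split 'Children;generations=3' into ('Children', {'generations': '3'})."""
    name, *raw_params = segment.split(";")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key] = value
    return name, params


# Hypothetical family data: person -> list of children.
CHILDREN = {"Agnes": ["Ruth"], "Ruth": ["Mary"], "Mary": ["Joe"]}


def descendants(root: str, generations: int) -> list:
    """Return everyone exactly `generations` levels below `root`."""
    current = [root]
    for _ in range(generations):
        current = [c for person in current for c in CHILDREN.get(person, [])]
    return current


name, params = split_path_params("Children;generations=3")
people = descendants("Agnes", int(params["generations"]))
print(name, params, people)
```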

The major thing of note with these approaches is that without further information in the identifier and request, you should expect that the first URI is retrieving a Person 147 generations up from Joe with the identifier of Joe.NotableAncestor. You should expect the second one to retrieve Joe.

Assume that what you actually want is for your calling client to be able to retrieve the entire set of nodes and their relationships between the root Person and the final context of your URI. You could do that with the same URI (with some additional decoration) and setting an Accept of text/vnd.graphviz on your request, which is the IANA-registered media type for the .dot graph representation. With that, change the URI to

/Persons/Joe/Parents;generations=147?id={Joe.NotableAncestor.Id}#directed

with an HTTP Request Header Accept: text/vnd.graphviz and you can have clients fairly clearly communicate that they want the directed graph of the generational hierarchy between Joe and 147 generations prior where that 147th ancestral generation contains a person identified as Joe's "Notable Ancestor."

I'm unsure if text/vnd.graphviz has any pre-defined semantics for its fragment; I could find none in a search for instruction. If that media type actually does have pre-defined fragment information, then its semantics should be followed to create a conforming URI. But, if those semantics are not pre-defined, the URI specification states that the semantics of the fragment identifier are unconstrained and instead defined by the server, making this usage valid.
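For illustration, a response body in the .dot format for such a request might be produced along these lines. The `lineage_to_dot` helper, the graph name `lineage`, and the choice to point edges from child to parent are all assumptions, not something the media type prescribes.

```python
def lineage_to_dot(edges) -> str:
    """Render (child, parent) pairs as a directed graph in DOT syntax."""
    lines = ["digraph lineage {"]
    for child, parent in edges:
        # Assumption: edges point from a person to their parent.
        lines.append(f'    "{child}" -> "{parent}";')
    lines.append("}")
    return "\n".join(lines)


body = lineage_to_dot([("Joe", "Mary"), ("Mary", "Ruth")])
print(body)
```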


  3. What are query strings really used for, besides "filtering" on your resource? If you go with the first approach, the filter parameter is embedded in the URI itself as a path parameter instead of a query string parameter.

I believe I have already thoroughly beaten this to death, but query strings are not for "filtering" resources. They are for identifying your resource from non-hierarchical data. If you have drilled down your hierarchy with your path by going /person/{id}/children/ and you are wishing to identify a specific child or a specific set of children, you would use some attribute that applies to the set you are identifying and include it inside the query.
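A small sketch of that division of labor: the path has already selected the set of children, and the query identifies members of that set by attribute. The `select` helper and the sample records are hypothetical.

```python
from urllib.parse import parse_qsl


def select(children: list, query: str) -> list:
    """Keep the children whose attributes match every key-value pair in the query."""
    criteria = dict(parse_qsl(query))
    return [
        child
        for child in children
        if all(str(child.get(key)) == value for key, value in criteria.items())
    ]


# Hypothetical result of resolving /person/{id}/children/ via the path.
children = [
    {"first_name": "Ann", "last_name": "Lee"},
    {"first_name": "Bob", "last_name": "Lee"},
]

print(select(children, "first_name=Ann"))
```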
