HTTP Status Codes – What Code to Return for Multiple Actions with Different Statuses

api, http

I am building an API where the user can ask the server to perform multiple actions in one HTTP request. The result is returned as a JSON array, with one entry per action.

Each of these actions might fail or succeed independently of each other. For instance, the first action might succeed, the input to the second action might be poorly formatted and fail to validate and the third action might cause an unexpected error.

If there were one request per action, I would return status codes 200, 422 and 500 respectively. But now that there is only one request, what status code should I return?

Some options:

Always return 200, and give more detailed information in the body.
Maybe follow the above rule only when there is more than one action in the request?
Maybe return 200 if all actions succeed, otherwise 500 (or some other code)?
Just use one request per action, and accept the extra overhead.
Something completely different?

Best Answer

The short, direct answer

The request asks the server to execute a list of tasks (the tasks are the resource here). If the task group was accepted and moved forward to execution, the response status should be 200 OK, regardless of how the individual tasks turn out. If something prevented the group from being executed at all, for example the task objects failed validation or a required service was unavailable, the response status should report that error. Beyond that, since the tasks to perform are listed in the request body, I would expect the execution results to be listed in the response body.
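A minimal sketch of that rule in Python. The function name handle_batch, the OPERATIONS registry and the task shape ({"op": ..., "args": [...]}) are all made up for illustration, and the choice of 400 and 422 for a rejected group is an assumption. The response status only says whether the group could be executed; the per-task outcomes go in the body.

```python
# Hypothetical registry of supported operations; not part of the original post.
OPERATIONS = {
    "add": lambda a, b: a + b,
    "divide": lambda a, b: a / b,
}


def handle_batch(tasks):
    """Return (http_status, response_body) for a batch of tasks."""
    # Problems that prevent the group from being executed at all
    # are reported through the response status (codes assumed here).
    if not isinstance(tasks, list):
        return 400, {"error": "request body must be a JSON array of tasks"}
    for index, task in enumerate(tasks):
        if not isinstance(task, dict) or task.get("op") not in OPERATIONS:
            return 422, {"error": f"task {index} is not a valid task object"}

    # The group was accepted for execution: the status is 200 whatever
    # happens to the individual tasks, and each outcome goes in the body.
    results = []
    for task in tasks:
        try:
            value = OPERATIONS[task["op"]](*task.get("args", []))
            results.append({"ok": True, "result": value})
        except Exception as exc:  # one failing task must not abort the rest
            results.append({"ok": False, "error": str(exc)})
    return 200, results
```

With this shape, a batch where one task divides by zero still gets a 200, and the client finds the failure in the matching entry of the result array.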

The long, philosophical answer

I suspect that you are facing this dilemma because you are diverging from what HTTP was designed for. I suspect that you are attempting to use it as a means of RMI (Remote Method Invocation) rather than as a means of managing resources.

The RMI perspective is that you design your URI scheme as you would the functions in an application: upon request, each URI executes an action and returns its result. Although implementations of this type are still relatively common, they often produce situations where HTTP gets in the way rather than making things easy.

(Just to note, RMI through HTTP has its merits in some instances, though you'd still be wise to implement these methods in a non-blocking manner. You could maybe offer startTask and getTaskStatus for instance, both of which would return instantly.)

The design of HTTP is asking you to use it to manage resources instead. It wants to express things like "add a task" (via POST), "get a task" (via GET), "delete a task" (via DELETE) and so on. Designing our URI scheme in that way, we rarely find ourselves in conflict with what HTTP has to offer.

To provide an example of what I mean by a URI scheme that conforms to resource management (vs RMI), here's a layout that might work for your case:

/task?complete=[true/false]&start=[start_timestamp]&end=[end_timestamp] ...
- GET searches for tasks according to querystring
- POST adds a single task
/task/[id]
- GET responds with a single task's state object
/task/[id]/cancellation_request
- POST adds a cancellation request for the task.
/task/[id]/[property_name]
- GET returns the value of the property of a task of the specified id
/task_group?complete=[true/false]&start=[start_timestamp]&end=[end_timestamp] ...
- GET searches for task groups according to querystring
- POST adds a group of tasks
/task_group/[id]
- GET responds with a task group object, which includes a list of task objects of all of the tasks in the group.

... and so on

As you may have guessed, task execution in this scheme is asynchronous. A POST to /task would not wait until the task has completed, or even until it has started running. It would simply queue the task for execution and then respond that the task was added, or that adding it failed (if the queue is full, for instance).

Note how the URIs have no verbs in them: they represent resources or collections of resources. The only verbs in this entire scheme are the HTTP methods that are invoked upon these URIs (GET/POST in this case).

Just to hammer this in a little more, URI stands for "Uniform Resource Identifier".
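A rough sketch of such a non-blocking POST /task handler in Python, using the standard library's bounded queue. The function name post_task, the response fields and the two status codes (202 for "queued", 503 for "queue full") are assumptions; the answer itself only says the server reports whether adding the task succeeded.

```python
import queue
import uuid

# Small bounded queue so the "queue is full" case is easy to reach.
task_queue = queue.Queue(maxsize=2)


def post_task(task):
    """Queue a task without running it; return (http_status, body)."""
    task_id = str(uuid.uuid4())
    try:
        # put_nowait never blocks: it either enqueues or raises queue.Full.
        task_queue.put_nowait((task_id, task))
    except queue.Full:
        # Assumed status code; the post only says the request "failed".
        return 503, {"error": "task queue is full"}
    # Assumed status code for "queued but not yet run".
    return 202, {"id": task_id, "location": f"/task/{task_id}"}
```

A separate worker would take items off task_queue and record their state, so that GET /task/[id] has something to report.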

Examples of how the above URI scheme would be used

Executing a single task and tracking progress:

POST /task with the task to execute
GET /task/[id] until the complete property of the response object is true, showing the current status/progress in the meantime. You can also push updates over a WebSocket if you want to avoid polling.

Executing a task group and tracking progress:

POST /task_group with the group of tasks to execute
GET /task_group/[groupId] until the complete property of the response object is true, showing individual task status in the meantime (3 tasks completed out of 5, for example)
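The polling half of these steps could look roughly like the following Python sketch. The fetch argument stands in for whatever performs the HTTP GET and parses the JSON; it, the done/total fields and the group id 42 are invented here so the example runs without a server.

```python
import time


def wait_for_completion(fetch, path, interval=0.0, max_polls=100):
    """Poll `path` via `fetch` until the returned object reports complete."""
    for _ in range(max_polls):
        state = fetch(path)  # stands in for an HTTP GET returning parsed JSON
        if state.get("complete"):
            return state
        time.sleep(interval)  # use a real delay against a real server
    raise TimeoutError(f"{path} did not complete after {max_polls} polls")


# Fake server for illustration: the group finishes on the third poll.
_responses = iter([
    {"complete": False, "done": 1, "total": 5},
    {"complete": False, "done": 3, "total": 5},
    {"complete": True, "done": 5, "total": 5},
])


def fake_fetch(path):
    return next(_responses)


final_state = wait_for_completion(fake_fetch, "/task_group/42")
```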

Regarding the example requests

/GoalTree/GetByDate?versionDate=...
/GoalTree/GetById?versionId=...

For the former, you said that you always return the revision nearest to that date. It will never fail to return an object, so it should always return 200 OK. Even if it accepted a date range and the logic were to return all objects within that timeframe, returning 200 OK with 0 results would be fine, because that is what the request asked for: the set of things that meet the criteria.

However, the latter is different, as you are asking for a specific object, presumably unique, with that identity. Returning 200 OK when no such object exists is wrong, as the requested resource doesn't exist and is not found.
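To make the contrast concrete, here is a small Python sketch. The REVISIONS data and the function names get_by_date and get_by_id are hypothetical stand-ins for the two endpoints, and the store is assumed to be non-empty, as the "nearest revision" rule implies.

```python
from datetime import date

# Hypothetical revisions keyed by version id; not data from the question.
REVISIONS = {
    1: {"id": 1, "date": date(2020, 1, 1)},
    2: {"id": 2, "date": date(2021, 6, 15)},
}


def get_by_date(version_date):
    """Nearest revision to a date: some revision always matches, so 200."""
    nearest = min(
        REVISIONS.values(),
        key=lambda rev: abs((rev["date"] - version_date).days),
    )
    return 200, nearest


def get_by_id(version_id):
    """A specific revision: a missing id means the resource is not found."""
    revision = REVISIONS.get(version_id)
    if revision is None:
        return 404, {"error": f"no revision with id {version_id}"}
    return 200, revision
```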

Regarding choosing status codes

2xx codes tell a User Agent (UA) that it did the right thing and the request worked. It can keep doing this in the future.
3xx codes tell a UA that what it asked for probably used to work, but that thing is now elsewhere. In future the UA might consider going straight to the redirect target.
4xx codes tell a UA that it did something wrong: the request it constructed isn't proper, and it shouldn't try again without at least some modification.
5xx codes tell a UA that the server is broken somehow. The query could still work in the future, so there is no reason not to try it again (except for 501, which is more of a 4xx issue).

In a comment you mentioned using a 5xx code, but your system is working. It was asked a query that doesn't work, and it needs to communicate that to the UA. No matter how you slice it, this is 4xx territory.
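Seen from the client side, the list above boils down to a small decision function. This Python sketch uses invented labels ("ok", "follow-redirect", "fix-request", "retry-later") and treats 501 as a request problem, as the list suggests; it is an illustration of the reasoning, not a rule from any HTTP library.

```python
def client_action(status):
    """Map an HTTP status code to the reaction the list above describes."""
    if 200 <= status < 300:
        return "ok"  # the request worked; keep doing this
    if 300 <= status < 400:
        return "follow-redirect"  # the thing lives elsewhere now
    if status == 501 or 400 <= status < 500:
        return "fix-request"  # retrying unchanged will not help
    if 500 <= status < 600:
        return "retry-later"  # server-side fault; the same request may work later
    return "unknown"
```

A client that receives "fix-request" for a lookup by id knows that repeating the same request is pointless, which is exactly what a 5xx would fail to tell it.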

Consider an alien querying our solar system

Alien: Computer, please tell me all planets that humans inhabit.

Computer: 1 result found. Earth

Alien: Computer, please tell me about Earth.

Computer: Earth - Mostly Harmless.

Alien: Computer, please tell me about all planets humans inhabit, outside the asteroid belt.

Computer: 0 results found.

Alien: Computer, please destroy Earth.

Computer: 200 OK.

Alien: Computer, please tell me about Earth.

Computer: 404 - Not Found

Alien: Computer, please tell me all planets that humans inhabit.

Computer: 0 results found.

Alien: Victory for the mighty Irken Empire!

REST – Do Web Applications Use HTTP as a Transport Layer?

404 means "the resource you asked for doesn't exist." It's up to the server to decide when that response is appropriate. Google has apparently interpreted it to mean "your API request was malformed." Other REST APIs might also interpret it to mean "your request was well-formed, but you are asking for something which does not exist." This is why you need to read the API docs.

The "server," in turn, is (as your RFC quote indicates) anything that responds to HTTP requests appropriately. If you build a server out of a gigantic web application framework, that's your business. The only requirement is that the client gets the semantically correct response in any given situation. The web application will often be better placed to make that decision than (say) Apache's out-of-the-box behavior, but you can and should set this up in whatever way makes the most sense for your situation.