Java/Postgres – How to Retry a Failed REST API Request

api-design aws postgres rest

We have a REST API that calls a third-party REST API to send emails. The third-party API is not very reliable and occasionally fails with a 500.

Our clients do not want to retry themselves and have instead asked us to build a retry mechanism for failed emails.

We are using Spring-Retry to implement the retry and circuit breaker patterns. In the fallback method we store the failed request somewhere (database or file is still an open question).
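To make the retry-then-fallback flow concrete, here is a minimal plain-Java sketch. It is not Spring-Retry's API, only a stand-in for the same idea: try the call a fixed number of times, and if every attempt fails, hand the last exception to a fallback that can persist the failed request. The class name `RetryWithFallback` and its `call` method are hypothetical, and the sketch deliberately leaves out backoff delays and the circuit breaker.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

/** Hypothetical helper: retries an action, then delegates to a fallback. */
final class RetryWithFallback {
    private RetryWithFallback() {}

    /**
     * Runs the action up to maxAttempts times. Returns the first successful
     * result; if all attempts throw, returns whatever the fallback produces
     * from the last exception (e.g. after storing the failed request).
     */
    static <T> T call(Callable<T> action, int maxAttempts, Function<Exception, T> fallback) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        return fallback.apply(last);
    }
}
```

In the setup described above, the fallback would be the place where the failed email request is written to the database or file so that the hourly job can find it later.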

We have a scheduled job that runs every hour, picks up all the failures whose initial retries were exhausted, and tries to re-send the emails.

My question is whether there are any best practices for storing the failed request:

  1. Should we store the request as is, with body, URL, and headers, in a blob/text column in the DB so that it is easier for the scheduled service to resend it?
  2. Should we write the failed request to a file somewhere, maybe S3, and resend it from there?
  3. Should we reconstruct the API request from scratch, using the data the client passed to us, which is already stored in the database in different tables (account numbers, usernames, URLs), and then fetching the API keys and rebuilding the URLs?

We are leaning towards option 3. There is more development work involved, but we already have all the data stored and can use it to reconstruct the whole request. Is there anything I am missing here, or are there any best practices or design patterns I can leverage?

Best Answer

The best approach with emails is not to have the API attempt to send them as part of handling the request. Sending email is a slow process and not a suitable task for a website's request path.

Instead, have the API persist the send-email request to a database, split into its various fields rather than stored as a blob.

Then have a worker process pick up new jobs from the database and attempt to send them. If a send fails, the job simply stays unsent and the worker picks it up again on its next run.
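A rough sketch of that worker loop in plain Java might look like the following. Everything here is illustrative: `EmailJob`, `EmailJobStore`, and `EmailWorker` are hypothetical names, the store is an in-memory list standing in for a real jobs table, and the `sender` predicate stands in for the call to the third-party email API (it returns true on success).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** One send-email request, kept as separate fields rather than a blob. */
final class EmailJob {
    enum Status { PENDING, SENT }

    final long id;
    final String recipient;
    final String subject;
    final String body;
    Status status = Status.PENDING;
    int attempts = 0;

    EmailJob(long id, String recipient, String subject, String body) {
        this.id = id;
        this.recipient = recipient;
        this.subject = subject;
        this.body = body;
    }
}

/** In-memory stand-in for the jobs table (assumption: a real system uses a DB). */
final class EmailJobStore {
    private final List<EmailJob> jobs = new ArrayList<>();
    private long nextId = 1;

    EmailJob enqueue(String recipient, String subject, String body) {
        EmailJob job = new EmailJob(nextId++, recipient, subject, body);
        jobs.add(job);
        return job;
    }

    /** Returns a snapshot of all jobs that have not been sent yet. */
    List<EmailJob> findPending() {
        List<EmailJob> pending = new ArrayList<>();
        for (EmailJob job : jobs) {
            if (job.status == EmailJob.Status.PENDING) {
                pending.add(job);
            }
        }
        return pending;
    }
}

/** Worker that tries every pending job once per run. */
final class EmailWorker {
    private final EmailJobStore store;
    private final Predicate<EmailJob> sender; // hypothetical: true means the send succeeded

    EmailWorker(EmailJobStore store, Predicate<EmailJob> sender) {
        this.store = store;
        this.sender = sender;
    }

    /** Attempts each pending job; failed jobs stay PENDING for the next run. */
    int runOnce() {
        int sent = 0;
        for (EmailJob job : store.findPending()) {
            job.attempts++;
            boolean ok;
            try {
                ok = sender.test(job);
            } catch (RuntimeException e) {
                ok = false; // treat an exception like a failed send
            }
            if (ok) {
                job.status = EmailJob.Status.SENT;
                sent++;
            }
        }
        return sent;
    }
}
```

The API endpoint would only call `enqueue`, and a scheduler would call `runOnce` periodically. Because a failed job is never removed or marked sent, no separate "failed request" storage is needed.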

A more advanced setup would replace the database with a message queue, but it's easier to explain with a database.

You can see how this setup makes it easy to handle the various failure scenarios. You can take all sorts of actions, including retrying, reporting back to the client after X amount of time, reporting on incorrect email addresses, and so on.
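One way to picture those decisions is a small policy function that the worker consults after each failed send. This is only a sketch under stated assumptions: the name `FailurePolicy`, the three actions, and the idea that the third-party API signals a permanently rejected request (such as an invalid address) with a 4xx status are all assumptions, not something the provider is known to guarantee.

```java
import java.time.Duration;

/** Hypothetical policy for what to do with a job whose send just failed. */
final class FailurePolicy {
    enum Action { RETRY, REPORT_REJECTED, REPORT_GIVEN_UP }

    private FailurePolicy() {}

    /**
     * @param httpStatus  status returned by the email provider for the failed send
     * @param jobAge      how long ago the job was first queued
     * @param giveUpAfter the "X amount of time" after which the client is told
     */
    static Action decide(int httpStatus, Duration jobAge, Duration giveUpAfter) {
        // Assumption: 4xx means the request itself is bad (e.g. invalid address),
        // so retrying the same request will not help.
        if (httpStatus >= 400 && httpStatus < 500) {
            return Action.REPORT_REJECTED;
        }
        // Anything else (e.g. the intermittent 500s) is treated as transient
        // until the job has been waiting too long.
        if (jobAge.compareTo(giveUpAfter) >= 0) {
            return Action.REPORT_GIVEN_UP;
        }
        return Action.RETRY;
    }
}
```

Keeping this logic in one place means the retry window or the set of non-retryable statuses can change without touching the worker loop.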