GCP Cloud SQL postgres in bad state after maintenance

google-cloud-platform google-cloud-sql

What looks like a planned maintenance ran on our HA managed Postgres Cloud SQL instance, and the instance has been in a bad state ever since.

Jan 7, 2018, 2:08:21 AM Update An unknown error occurred.

The failover instance did not take over as it should have, and now it no longer exists. We can't restart the instance or perform any other operation on it, so this production database is totally unavailable.

2018/01/08 09:41:24 couldn't connect to "ourprojectid:us-central1:instance name": googleapi: Error 409: The instance or operation is not in an appropriate state to handle the request., invalidState

We also tried contacting support directly by email, as suggested in similar posted issues.

https://stackoverflow.com/questions/42719547/cloud-sql-instances-are-not-starting-or-restarting-its-stuck

We are starting to consider creating a new instance and restoring from a backup, but I would expect more resiliency from an HA managed instance during a scheduled maintenance. It has been down for more than a day.

Thanks in advance

Best Answer

First, please do not share your GCP project ID or Cloud SQL instance information on a community thread like this one. Reach out to GCP Support Engineers directly if you need that kind of review of your Cloud SQL instance.

As the error suggests, either an operation is stuck or the Cloud SQL instance itself is stuck in an error state. There are several reasons why this error may occur, including:

  1. Trying to reuse an instance name within a week after the instance was deleted. Similar issue reported here

  2. An operation is indeed stuck. This would require the GCP Support Engineers to stop the stuck operation.

  3. The instance may have become unhealthy or unavailable for some other reason, including an internal or underlying issue. GCP engineers will also be able to help in this case.
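Before contacting support, it can help to see which of these cases applies. The following is a minimal sketch, assuming the `gcloud` CLI is installed and authenticated; it only builds two read-only commands (instance state and recent operations) and prints them. The function name `inspection_commands` and the instance name `my-instance` are placeholders for illustration.

```python
import shlex


def inspection_commands(instance: str) -> list[list[str]]:
    """Build read-only gcloud commands for a possibly stuck Cloud SQL instance."""
    return [
        # Current instance state as reported by the Cloud SQL API.
        ["gcloud", "sql", "instances", "describe", instance,
         "--format=value(state)"],
        # Most recent operations; one that never finishes may be the stuck one.
        ["gcloud", "sql", "operations", "list",
         f"--instance={instance}", "--limit=10"],
    ]


if __name__ == "__main__":
    # "my-instance" is a placeholder, not a real instance name.
    for cmd in inspection_commands("my-instance"):
        print(shlex.join(cmd))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` if you prefer to run it from Python. If an operation is listed as still running long after the maintenance window, that points to case 2, which only support can clear.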

Generally, recreating the Cloud SQL instance and restoring from backups, as you have rightly done, is a practical way to get around the issue.
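As a rough sketch of that workaround, assuming the `gcloud` CLI is available and that a replacement instance has already been created, the commands below list the backups of the broken instance and restore one of them into the new instance. The helper names and all arguments (`old-instance`, `new-instance`, the backup ID) are hypothetical placeholders, and whether the backups of a broken instance are still reachable is not guaranteed.

```python
import shlex


def list_backups_command(source_instance: str) -> list[str]:
    """Command that lists the backups taken from the original instance."""
    return ["gcloud", "sql", "backups", "list", f"--instance={source_instance}"]


def restore_command(backup_id: str, source_instance: str,
                    target_instance: str) -> list[str]:
    """Command that restores a backup of source_instance into target_instance.

    The restore overwrites the data currently on the target instance.
    """
    return [
        "gcloud", "sql", "backups", "restore", backup_id,
        f"--restore-instance={target_instance}",
        f"--backup-instance={source_instance}",
    ]


if __name__ == "__main__":
    # All names below are placeholders; take the backup ID from the list output.
    print(shlex.join(list_backups_command("old-instance")))
    print(shlex.join(restore_command("BACKUP_ID", "old-instance", "new-instance")))
```

Once the restore completes, point the application at the new instance's connection name.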
