Went through the GCP support channel and they were able to get my cluster back up and running. According to the support rep, this is not always necessary:
Please note that it's not the expected behavior for the clusters to be in a problematic state after a project restore. It can happen from time-to-time though, but are rare occurrences.
That did it for me. I'm not sure what you would need to do if you didn't have a GCP support package, though.
I saw few blogs talking of a GKE feature to create a clone of existing
GKE Cluster but I cannot find any option in GCP Console to create new
cluster by cloning an existing GKE Cluster.
As for creating a new GKE cluster as a clone of an existing one, it looks like this option is still available but was moved to a different section. It is now in the cluster details (click your cluster name) and is called DUPLICATE.
Creating clusters from templates in general has been removed from Cloud Console. You can read about it here:
GKE previously supported templates for clusters. Those templates were
removed from Google Cloud Console ...
When we talk about backing up a Kubernetes cluster, we need to keep a basic distinction in mind: backing up the cluster itself is one thing, and backing up its workloads or the resources deployed to it is another.
As of now, there is no single tool that performs both operations together, at least not on managed Kubernetes solutions such as GKE. Of course, an on-premises Kubernetes installation offers completely different options for a full cluster backup, e.g. disk snapshots.
Velero (formerly known as Heptio Ark) is a great tool that enables you to back up and restore your Kubernetes cluster resources as well as persistent volumes. And it can be used with any public cloud provider or on-premises k8s installation.
However, as you can read in the Cluster Migration description, there is one caveat when migrating persistent volumes between different cloud providers:
Velero can help you port your resources from one cluster to another,
as long as you point each Velero instance to the same cloud object
storage location. This scenario assumes that your clusters are hosted
by the same cloud provider. Note that Velero does not natively
support the migration of persistent volumes snapshots across cloud
providers. If you would like to migrate volume data between cloud
platforms, please enable restic, which will backup volume contents at
the filesystem level.
As you can see, it can still be done with the help of restic. However, if you migrate workloads deployed on GKE to another GKE cluster, you don't need it.
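A minimal sketch of the GKE-to-GKE case, assuming Velero is already installed on both clusters and both instances point at the same object storage bucket. The helper name, the backup name and the namespaces are hypothetical. It only builds the two velero CLI invocations (the backup is run against the source cluster, the restore against the destination cluster) and does not execute them.

```python
def velero_migration_commands(backup_name, namespaces=None):
    """Build the velero CLI calls for a same-provider cluster migration.

    Returns (backup_cmd, restore_cmd) as argument lists:
      - backup_cmd is meant to be run with kubectl pointed at the source cluster
      - restore_cmd is meant to be run with kubectl pointed at the destination
    """
    backup_cmd = ["velero", "backup", "create", backup_name]
    if namespaces:
        # Limit the backup to selected namespaces; omit to back up everything.
        backup_cmd += ["--include-namespaces", ",".join(namespaces)]

    restore_cmd = ["velero", "restore", "create", "--from-backup", backup_name]
    return backup_cmd, restore_cmd


if __name__ == "__main__":
    backup, restore = velero_migration_commands("gke-migration", ["web", "db"])
    print(" ".join(backup))
    print(" ".join(restore))
```

The argument lists can be passed to subprocess.run once the right kubectl context is selected for each step.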
As for backing up or cloning an existing GKE cluster (the cluster itself, not its workloads), an interesting approach is to save it as code that can be used to easily re-create it later. You can use an Infrastructure as Code tool such as Terraform and its import option.
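As a rough illustration of the import step, the sketch below builds the terraform import call for an existing cluster. It assumes the Google provider's google_container_cluster resource and that a matching resource block with the given name already exists in your configuration. The function names and the example project, location and cluster values are hypothetical.

```python
def gke_import_id(project, location, cluster):
    """Return the import ID for a GKE cluster in the long form
    accepted by the google_container_cluster resource."""
    return f"projects/{project}/locations/{location}/clusters/{cluster}"


def terraform_import_command(resource_name, project, location, cluster):
    """Build (but do not run) the terraform import call that attaches an
    existing cluster to the resource google_container_cluster.<resource_name>."""
    return [
        "terraform", "import",
        f"google_container_cluster.{resource_name}",
        gke_import_id(project, location, cluster),
    ]


if __name__ == "__main__":
    cmd = terraform_import_command("primary", "my-project", "us-central1-a", "prod")
    print(" ".join(cmd))
```

After the import, the cluster's settings live in Terraform state and still have to be written into the configuration itself before the cluster can be re-created from it.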
Best Answer
As the OP mentioned in a comment, it is possible to start/stop a GKE cluster using the resize command from gcloud. However, as new versions and features come out, this command needs to be adjusted.
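A minimal sketch of that resize call, assuming a zonal cluster where the node pool has to be named explicitly. The helper names and the example cluster, pool and zone are hypothetical. By default it only prints the command (dry run) instead of calling gcloud.

```python
import shlex
import subprocess


def build_resize_command(cluster, node_pool, num_nodes, zone):
    """Return the gcloud argument list that resizes one node pool."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be zero or greater")
    return [
        "gcloud", "container", "clusters", "resize", cluster,
        "--node-pool", node_pool,
        "--num-nodes", str(num_nodes),
        "--zone", zone,
        "--quiet",  # skip the interactive confirmation prompt
    ]


def resize_node_pool(cluster, node_pool, num_nodes, zone, dry_run=True):
    """Print the resize command, or run it when dry_run is False."""
    cmd = build_resize_command(cluster, node_pool, num_nodes, zone)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # "Stop" the cluster by scaling the pool to 0; scale back up to "start" it.
    resize_node_pool("my-cluster", "default-pool", 0, "us-central1-a")
```

Calling resize_node_pool again with a positive node count brings the pool back up.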
In the current default version (1.15.12-gke.2), GKE uses node pools. This lets you have several node pools, each with a different image type, machine configuration, disk size, etc.
Because of this, when you resize a cluster you also need to specify which node pool you want to resize.
With this command you can increase or decrease the number of nodes in your node pool. You don't need to worry that resizing your node pools to 0 will delete or lose data in the cluster: the master is managed by Google, and when you resize the node pool back up, all configuration and deployed resources will still be there.
However, there is another solution: to use Cluster Autoscaler.