My problem occurred when using Redis on Kubernetes, but it does not seem to be a problem with Redis itself; it looks like a network/infrastructure issue.
My scenario:
- I have a Redis Service with a single Redis Pod serving it.
- I connect a Redis Client to the Service.
- I delete the Redis Pod.
- The Client connection is closed.
- Redis Client tries to reconnect.
- Meanwhile, the Redis ReplicaSet brings up a new Redis Pod, and the Redis Service starts responding to requests and accepting new connections.
- However, my existing Redis Client hangs on the connection (first reconnect attempt) and stays that way until it times out (after approximately 130 seconds).
- On the second reconnect attempt it connects immediately.
The problem does not seem to exist in my dev environment (local Docker containers), because there the timeout shows up after a second or two.
Also, the client that I am using has no option to configure a socket timeout.
- Is this the proper behavior of a Service: hanging a connection until a timeout occurs when there are no Pods to handle requests? If it responded with an error immediately, there would be no such problem.
- Is there a way to configure this timeout to an acceptable value somewhere (at the Service level, in some network configuration, etc.)? Let's say 5 seconds would be OK.
Best Answer
A Service is an abstraction in Kubernetes which defines a logical set of Pods and a policy by which to access them. Technically, a Service is a type of Kubernetes resource that causes a proxy to be configured to forward requests to a set of pods.
In your case, your Redis Client connects directly to your Redis Pod using some firewall rules configured in a Service. Therefore, if the Service was untouched, connection problems may appear only on the Pod side or on the Client side. So, you need to look for configuration settings for them.
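As an illustration of handling this on the Client side, here is a minimal sketch in Python of bounding each connection attempt to a few seconds and retrying, instead of waiting for the operating system's default connect timeout. It uses only the standard `socket` module, not a Redis client library; the function name `connect_with_retry` and its parameters are made up for this example, and the 5-second default simply mirrors the value mentioned in the question.

```python
import socket
import time


def connect_with_retry(host, port, timeout=5.0, attempts=3, delay=0.5):
    """Open a TCP connection, giving up on each attempt after `timeout` seconds.

    Hypothetical helper for illustration only; a real Redis client would
    need an equivalent connect-timeout setting or a wrapper like this.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            # create_connection applies `timeout` to the connect call itself,
            # so a hanging attempt fails after `timeout` seconds.
            return socket.create_connection((host, port), timeout=timeout)
        except OSError as exc:  # covers timeouts and refused connections
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)  # short pause before the next attempt
    raise ConnectionError(
        f"could not connect to {host}:{port} after {attempts} attempts"
    ) from last_error
```

With something like this, a first reconnect attempt that hangs would be abandoned after about 5 seconds, and the next attempt would reach the new Pod once the Service routes to it. Whether your particular Redis client can be wrapped this way depends on the client.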