I'm sorry about the complexity! I'm not an expert on Compute Engine firewalls, but I expect you're correct that source tags only work for internal traffic.
The Kubernetes team is aware that coordinating multiple clusters is difficult, and we're beginning to work on solutions, but unfortunately we don't have anything particularly solid and usable for you yet.
In the meantime, there is a hacky way to load balance traffic from one cluster to the other without requiring the Google Cloud Load Balancer or something like haproxy. You can specify the internal IP address of one of the nodes in cluster B (or the IP of a GCE route that directs traffic to one of the nodes in cluster B) in the PublicIPs field of the service that you want to talk to. Then, have cluster A send its requests to that IP on the service's port, and they'll be balanced across all the pods that back the service.
This should work because a kube-proxy runs on each node of the Kubernetes cluster and automatically proxies traffic intended for a service's IP and port to the pods backing that service. As long as the PublicIP is in the service definition, the kube-proxy will balance the traffic for you.
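For illustration, here is a rough sketch of what the cluster B service definition might look like, built as a plain Python dict and printed as JSON. The key names (`id`, `port`, `selector`, `publicIPs`), the service name, the port, and the node IP are all assumptions made up for this example; check the exact schema against the API version your cluster runs.

```python
import json


def service_with_public_ips(name, port, selector, public_ips):
    """Build a service definition as a plain dict.

    The key names here are illustrative assumptions, not a verified schema.
    """
    return {
        "kind": "Service",
        "id": name,
        "port": port,
        "selector": dict(selector),
        # Internal IP of a cluster B node (or of a GCE route to that node).
        "publicIPs": list(public_ips),
    }


# Hypothetical values: one node in cluster B at 10.240.0.5, service on 8080.
service_b = service_with_public_ips(
    name="backend",
    port=8080,
    selector={"app": "backend"},
    public_ips=["10.240.0.5"],
)

print(json.dumps(service_b, indent=2))
```

Cluster A would then send its requests to 10.240.0.5:8080 (in this made-up example), and the kube-proxy on that node would spread them across the pods matching the selector.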
If you stop here, this is only as reliable as the node whose IP you're sending traffic to (though single-node reliability is actually quite high). However, if you want to get really fancy, we can make things a little more reliable by load balancing from cluster A across all the nodes in cluster B.
To make this work, you would put all of cluster B's nodes' internal IPs (or routes to all the nodes' internal IPs) in your service's PublicIPs field. Then, in cluster A, you could create a separate service with an empty label selector and manually populate its endpoints field at creation time with an (IP, port) pair for each IP in cluster B. The empty label selector prevents the Kubernetes infrastructure from overwriting your manually-entered endpoints, and the kube-proxies in cluster A will load balance traffic for the service across cluster B's IPs. This was made possible by PR #2450, if you want more context.
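As a rough sketch of the cluster A side, the snippet below generates the (IP, port) pairs and a selector-less service definition as a Python dict. The node IPs, port, service name, and the dict layout (including the `endpoints` key and the "ip:port" string form) are assumptions for illustration only; the real field names and where the endpoints live depend on your API version.

```python
import json

# Hypothetical internal IPs of cluster B's nodes and the service port.
CLUSTER_B_NODE_IPS = ["10.240.0.5", "10.240.0.6", "10.240.0.7"]
SERVICE_PORT = 8080


def manual_endpoints(node_ips, port):
    """Return one (IP, port) pair per cluster B node."""
    return [(ip, port) for ip in node_ips]


def selectorless_service(name, port, endpoints):
    """Build a service with an empty selector and hand-written endpoints.

    The layout is an illustrative assumption, not a verified schema.
    """
    return {
        "kind": "Service",
        "id": name,
        "port": port,
        # Empty selector, so the endpoints are not overwritten automatically.
        "selector": {},
        "endpoints": [f"{ip}:{p}" for ip, p in endpoints],
    }


pairs = manual_endpoints(CLUSTER_B_NODE_IPS, SERVICE_PORT)
service_a = selectorless_service("backend-in-b", SERVICE_PORT, pairs)

print(json.dumps(service_a, indent=2))
```

The same list of node IPs would also go into the PublicIPs field of the service in cluster B, so that every node listed as an endpoint actually accepts traffic for the service.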
Let me know if you need more help with any of this!
I solved this by upgrading to Kubernetes 1.4.
The 1.4 release included several fixes to keep Kubernetes from crashing under out-of-memory conditions. I think this reduced the likelihood of hitting this issue, although I'm not convinced that the core issue was fixed (unless the issue was that one of the kube-dns instances had crashed or become non-responsive because the Kubernetes system was unstable when a node hit OOM).
There isn't a way to resize to zero from the cloud console (and since the iOS app uses the console I'm guessing it applies there too, although I haven't been able to verify).