Your confusion is reasonable - the two are often the same thing. But not always. A load balancer is a very specific thing: a server or device that distributes inbound requests across two or more web servers to spread the load. A reverse proxy, however, typically offers a number of features:
load balancing: as discussed above
caching: it can cache content from the web server(s) behind it, serving repeat requests for static content directly and thereby reducing the load on the web server(s)
security: it can protect the web server(s) by preventing direct access from the internet; this may be as simple as obfuscating the web server(s), or it may involve more active components that inspect inbound requests for malicious code
SSL acceleration: when SSL is used, it may serve as the termination point for those SSL sessions, offloading the encryption workload from the web server(s)
I think this covers most of it, but there are probably a few other features I've missed. Certainly it isn't uncommon to see a device or piece of software marketed as a load balancer/reverse proxy because the features are so commonly bundled together.
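To make this concrete, here's a minimal nginx sketch that combines three of those roles - load balancing, caching, and SSL termination - in a single reverse proxy. The upstream addresses, hostname, and certificate paths are all hypothetical:

```nginx
# Goes inside the http {} context of nginx.conf.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:10m max_size=1g;

upstream backend_pool {
    server 10.0.0.11:8080;   # web server 1 (hypothetical)
    server 10.0.0.12:8080;   # web server 2 (hypothetical)
}

server {
    listen 443 ssl;                       # SSL is terminated here, not on the web servers
    server_name example.com;
    ssl_certificate     /etc/nginx/certs/example.com.crt;
    ssl_certificate_key /etc/nginx/certs/example.com.key;

    location / {
        proxy_cache static_cache;         # serve cached responses when possible
        proxy_cache_valid 200 10m;        # keep successful responses for 10 minutes
        proxy_pass http://backend_pool;   # balance remaining requests across the pool
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

The backends only ever see plain HTTP from the proxy, and cached static content never reaches them at all.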
I'm sorry about the complexity! I'm not an expert on Compute Engine firewalls, but I expect that you're correct about the limitations of the source tags to only work for internal traffic.
The Kubernetes team is aware that coordinating multiple clusters is difficult, and we're beginning to work on solutions, but unfortunately we don't have anything particularly solid and usable for you yet.
In the meantime, there is a hacky way to load balance traffic from one cluster to the other without requiring the Google Cloud Load Balancer or something like haproxy. You can specify the internal IP address of one of the nodes in cluster B (or the IP of a GCE route that directs traffic to one of the nodes in cluster B) in the PublicIPs field of the service that you want to talk to. Then, have cluster A send its requests to that IP on the service's port, and they'll be balanced across all the pods that back the service.
It should work because there's a component called the kube-proxy running on each node of the Kubernetes cluster, which automatically proxies traffic intended for a service's IP and port to the pods backing the service. As long as the PublicIP is in the service definition, the kube-proxy will balance the traffic for you.
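As a sketch, the service definition in cluster B could look something like the following. This is written against the later stable v1 API, where the field was renamed externalIPs (at the time of this answer it was publicIPs); the node IP and labels are hypothetical:

```yaml
kind: Service
apiVersion: v1
metadata:
  name: my-service
spec:
  selector:
    app: my-app             # hypothetical label on the backing pods
  ports:
    - port: 8080            # port cluster A will send requests to
      targetPort: 80        # port the backing pods listen on
  externalIPs:              # was publicIPs in the API version this answer describes
    - 10.240.0.5            # hypothetical internal IP of a node in cluster B
```

Traffic from cluster A to 10.240.0.5:8080 is then spread by that node's kube-proxy across all pods backing the service.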
If you stop here, this is only as reliable as the node whose IP you're sending traffic to (but single-node reliability is actually quite high). However, if you want to get really fancy, we can make things a little more reliable by load balancing from cluster A across all the nodes in cluster B.
To make this work, you would put the internal IPs of all of cluster B's nodes (or routes to all the nodes' internal IPs) in your service's PublicIPs field. Then, in cluster A, you could create a separate service with an empty label selector and manually populate its endpoints with an (IP, port) pair for each node IP in cluster B, as sketched below. The empty label selector prevents the Kubernetes infrastructure from overwriting your manually-entered endpoints, and the kube-proxies in cluster A will load balance traffic for the service across cluster B's IPs. This was made possible by PR #2450, if you want more context.
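Here's a rough sketch of what that could look like in cluster A, again written against the later stable v1 API (a selector-less Service plus a manually created Endpoints object); all names and IPs are hypothetical:

```yaml
kind: Service
apiVersion: v1
metadata:
  name: cluster-b-service
spec:
  # No selector: Kubernetes won't manage endpoints for this service,
  # so the manually created Endpoints object below is left alone.
  ports:
    - port: 8080
---
kind: Endpoints
apiVersion: v1
metadata:
  name: cluster-b-service   # must match the service name above
subsets:
  - addresses:
      - ip: 10.240.0.5      # hypothetical internal IPs of cluster B's nodes
      - ip: 10.240.0.6
      - ip: 10.240.0.7
    ports:
      - port: 8080
```

Pods in cluster A then just talk to cluster-b-service, and their local kube-proxies spread the traffic across cluster B's nodes.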
Let me know if you need more help with any of this!
Google Cloud now seems to support custom request and response headers for HTTP(S) Load Balancers. I've added a custom `Strict-Transport-Security` response header for our backend and it works as expected. In the given example we use a backend bucket, but the custom header option is available for other backend types too.
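For reference, setting such a header on a backend bucket can be done with gcloud along these lines; the bucket name and header value below are placeholders, so check the current gcloud docs for the exact flag on your backend type:

```sh
# Hypothetical backend bucket name; a similar flag exists for backend services.
gcloud compute backend-buckets update my-backend-bucket \
    --custom-response-header='Strict-Transport-Security: max-age=31536000; includeSubDomains'
```

Once applied, the load balancer adds the header to every response served from that backend, so the buckets or instances behind it don't need to set it themselves.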