You might reconsider your Windows configuration. I have used direct routing with LVS successfully on Windows. Per the documentation a member of my team wrote:
First, install the Windows Loopback Adapter:
1. Go to Start > hdwwiz.exe.
2. Click Next, then "Install the hardware that I manually select from a list (Advanced)".
3. Scroll down and click "Network Adapters".
4. Choose Microsoft, then Microsoft Loopback Adapter.
5. Finish the wizard.
6. Go to Control Panel\Network and Internet\Network Connections and rename the adapters to descriptive names.
7. Right-click the loopback adapter and manually assign it the LVS VIP.
8. Go to Start > cmd.exe (right-click and choose "Run as administrator") and run these commands:
    netsh interface ipv4 set interface "Name of Adapter that holds the real host IP" weakhostreceive=enabled
    netsh interface ipv4 set interface "loopback" weakhostreceive=enabled
    netsh interface ipv4 set interface "loopback" weakhostsend=enabled
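If you'd rather script the VIP assignment than click through the GUI, and want to confirm the weak host settings took effect, something along these lines should work ("loopback" is whatever you renamed the adapter to, and the VIP is a placeholder):

    :: Assign the LVS VIP to the loopback adapter with a host (/32) mask - placeholder address
    netsh interface ipv4 add address "loopback" 203.0.113.10 255.255.255.255
    :: Confirm that weak host send/receive now show as enabled
    netsh interface ipv4 show interface "loopback"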
This was a Windows 2008 server, which was configured initially using this Web site for guidance.
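For completeness, the director side of that direct-routing setup is just a few ipvsadm commands on the LVS box. A minimal sketch, assuming a VIP of 203.0.113.10 and two real servers at 10.0.0.11 and 10.0.0.12 (all placeholder addresses):

    # -A adds the virtual service on the VIP; -s wlc selects the weighted least-connection scheduler
    ipvsadm -A -t 203.0.113.10:80 -s wlc
    # -a adds each real server; -g means gatewaying (direct routing), so replies
    # go straight from the real server to the client instead of back through the director
    ipvsadm -a -t 203.0.113.10:80 -r 10.0.0.11:80 -g
    ipvsadm -a -t 203.0.113.10:80 -r 10.0.0.12:80 -g
    # List the table to verify
    ipvsadm -L -n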
As far as logging goes, often the only solution will be to do the logging at a point where the client's real IP is still on the route. With Web traffic, the X-Forwarded-For request header (surfaced to applications as HTTP_X_FORWARDED_FOR in CGI-style environments) could be used. The point being that beyond a certain hop, the network layer cannot be relied on for this information, so you have to move further up the stack for potential solutions.
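As a quick sanity check that the backend's logging actually picks up the forwarded address, you can hand-craft a request with the header set the way a proxy would send it (addresses are placeholders):

    # Simulate what an L7 proxy sends on to a backend (placeholder addresses)
    curl -s -o /dev/null -H "X-Forwarded-For: 198.51.100.23" http://10.0.0.11/
    # The backend access log should record 198.51.100.23 for that request, which
    # only happens if the log format reads the header rather than the peer IP.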
The canonical solution to this is to not rely on the end-user IP address, but instead use a Layer 7 (HTTP/HTTPS) load balancer with "sticky sessions" via a cookie.
"Sticky sessions" means the load balancer will always direct a given client to the same backend server. "Via a cookie" means the load balancer (which is itself a fully capable HTTP device) inserts a cookie (which it creates and manages automagically) to remember which backend server a given HTTP connection should use.
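If you go the cookie route, it's easy to confirm from a client that the balancer really is inserting and honoring its cookie. A rough sketch, assuming the balancer answers at lb.example.com (placeholder hostname):

    # First request: the response should carry a Set-Cookie header naming the persistence cookie
    curl -s -D - -o /dev/null -c cookies.txt http://lb.example.com/
    # Replaying that cookie should keep landing on the same backend
    # (confirm via the backend access logs, or a debug header if your balancer adds one)
    curl -s -o /dev/null -b cookies.txt http://lb.example.com/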
The main downside to sticky sessions is that backend server load can become somewhat uneven. The load balancer can only distribute load fairly when new connections are made, and since existing connections may be long-lived in your scenario, there will be periods when load is not distributed entirely evenly.
Just about every Layer 7 load balancer can do this. On Unix/Linux, some common examples are nginx, HAProxy, Apsis Pound, Apache 2.2 with mod_proxy, and many more. On Windows 2008+ there is Microsoft Application Request Routing. As appliances, Coyote Point, loadbalancer.org, Kemp and Barracuda are common in the low-end space, with F5, Citrix NetScaler and others at the high end.
Willy Tarreau, the author of HAProxy, has a nice overview of load balancing techniques here.
About the DNS Round Robin:
Our intent was for the Round Robin DNS TTL value for our api.company.com (which we've set at 1 hour) to be honored by the downstream caching nameservers, OS caching layers, and client application layers.
It will not be. DNS round robin isn't a good fit for load balancing anyway, and if nothing else convinces you, keep in mind that modern clients may prefer one host over all the others due to longest prefix match pinning, so if a mobile client changes IP address it may switch to a different RR host.
Basically, it's okay to use DNS round robin for coarse-grained load distribution, by pointing 2 or more RR records at highly available IP addresses that are handled by real load balancers in active/passive or active/active HA. And if that's what you're doing, you might as well serve those DNS RR records with long Time To Live values, since the associated IP addresses are already highly available.
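An easy way to see what resolvers and clients actually get back, including the remaining TTL, is to query the records directly (the authoritative nameserver below is a placeholder):

    # Ask your local resolver: shows all of the RR A records plus the TTL remaining in its cache
    dig +noall +answer api.company.com A
    # Ask the authoritative server directly to see the full configured TTL
    dig +noall +answer api.company.com A @ns1.company.com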
Best Answer
I have rolled out a hybrid IPVS/HAProxy setup. HAProxy was used for some fairly heavy L7 decision-making, which made it necessary to scale it out at relatively low traffic volumes. Putting IPVS in front made it possible to scale out the HAProxy nodes and removed the need to manage failover at that layer. It worked fine for the specific use case I had.
I wouldn't recommend this setup for your stated situation. By having both in the mix, you remove the reasons for going with IPVS in the first place, because as long as HAProxy is in the stack somewhere it will behave the same as it does now. Any problems HAProxy has with long-lived TCP connections will still exist (the TCP connections still pass through an HAProxy instance), you will only be able to do DSR from the HAProxy box out to the Internet, and when an HAProxy box goes down you will still lose all the connections that were going through that instance.
If you don't need the specific features that HAProxy gives you (L7 intelligence), then just use IPVS (for the benefits you stated you want). If you do need the specific features that HAProxy gives you, then use it instead of IPVS. Yes, it's a trade-off. You'll need to decide which is more important, and which set of missing features you can more easily engineer around (for instance, by moving some intelligence to the backend, or doing a better job of dealing with dropped connections and re-establishing without user-visible impact).
You should use both in combination only if you need HAProxy's features and you also need to scale HAProxy out, i.e. you have a situation in which a single HAProxy box won't cope but a single IPVS DSR box will.