What you want is called Microsoft Application Request Routing 2 (ARR). (Maybe the clumsy name is part of why so few people know of its existence?)
Microsoft ARR is a free-of-charge HTTP layer load balancer, implemented as a module for IIS 7+. (ARR itself is gratis, but the Windows Server license is of course required for the underlying OS.)
Since ARR is just a thin shim on top of IIS, it is quite fast and robust. Administering ARR will also be familiar to you, since you're already an IIS shop: ARR simply installs itself into the IIS Manager GUI.
For a true high-availability setup, you should combine NLB and ARR, so that NLB keeps the ARR server tier highly available, and ARR keeps the backend web server tier highly available. See Microsoft's docs, and see the long list of documentation at the end of the ARR overview page linked at the top.
The only real downside to ARR is that a true high-availability setup requires at least 2 Windows Server licenses and physical servers. Given that, and given the time it takes to set up, low-end load balancer appliances such as Coyote Point or loadbalancer.org (or Kemp, Barracuda Networks, or any of the other low-end vendors) can sometimes be a cost-effective alternative.
"ability to seamlessly take a web server out of the load-balanced mix for maintenance without interrupting users."
That will depend on how session state is handled, i.e. whether or not your backend servers share the "this user is logged in" information.
If the webapp tier is stateless (i.e. session state lives in a shared datastore, e.g. a shared RAM cache or MSSQL), then you can just pull a server out of the pool. If not, then you can use "sticky sessions" on the load balancer: remove the backend server from the load balancer pool so that it gets no new sessions, and then wait until all users have 'drained off' the server in question.
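To make the 'drain off' idea concrete, here is a rough Python sketch of how a sticky-session pool might behave. It is only a toy model: the `Backend` and `StickyPool` classes and the server names are made up for illustration and are not how ARR (or any particular load balancer) implements it.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    name: str
    draining: bool = False
    sessions: set = field(default_factory=set)


class StickyPool:
    """Toy sticky-session pool; every name here is hypothetical."""

    def __init__(self, backends):
        self.backends = {b.name: b for b in backends}

    def route(self, session_id):
        # Existing sessions stay on their backend, even while it drains.
        for b in self.backends.values():
            if session_id in b.sessions:
                return b.name
        # New sessions only go to backends that are not draining.
        candidates = [b for b in self.backends.values() if not b.draining]
        if not candidates:
            raise RuntimeError("no backend accepts new sessions")
        target = min(candidates, key=lambda b: len(b.sessions))
        target.sessions.add(session_id)
        return target.name

    def drain(self, name):
        # Stop handing new sessions to this backend.
        self.backends[name].draining = True

    def end_session(self, session_id):
        # Called when a user logs out or the session times out.
        for b in self.backends.values():
            b.sessions.discard(session_id)

    def safe_to_remove(self, name):
        # A backend can be taken down once it is draining and empty.
        b = self.backends[name]
        return b.draining and not b.sessions


pool = StickyPool([Backend("web1"), Backend("web2")])
first = pool.route("alice")   # new session, goes to the least-loaded backend
pool.drain(first)             # maintenance on that backend is planned
second = pool.route("bob")    # new session avoids the draining backend
still = pool.route("alice")   # existing session keeps its backend
pool.end_session("alice")     # alice leaves; the backend is now empty
```

Once `safe_to_remove` returns true for the drained server, nobody is left on it and it can be taken down without interrupting users.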
Willy Tarreau, the author of HAProxy, has a nice overview of load balancing techniques and issues here.
Best Answer
There are a couple of ways to achieve HA (high availability) for a load balancer - or, for that matter, any service. Let's assume you have two machines, with IP addresses:
Users connect to an IP, so what you want to do is separate the IP from any specific box, i.e. create a virtual IP. That IP will be 192.168.100.100.
Now you can choose an HA service that takes care of automatic failover/failback of the IP address. Some of the simplest services for Unix are (u)carp and keepalived; more complex ones include Red Hat Cluster Suite and Pacemaker.
Let's take keepalived as an example: two keepalived services, each running on its own box, communicate with each other. That communication is often called a heartbeat.
If one keepalived stops responding (either the service goes down for whatever reason, or the box reboots or shuts down), keepalived on the other box will notice the missed heartbeats, presume the other node is dead, and take failover actions. In our case that action is bringing up the floating IP.
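The missed-heartbeat logic can be sketched roughly like this in Python. It is a simplified model, not keepalived's actual implementation (keepalived speaks VRRP and has priorities and preemption settings); the `HeartbeatMonitor` class, the 1 second interval and the "dead after 3 missed heartbeats" threshold are all assumptions made up for the example.

```python
class HeartbeatMonitor:
    """Toy model of the backup node watching its peer (hypothetical names)."""

    def __init__(self, interval=1.0, dead_after=3):
        self.interval = interval      # seconds between expected heartbeats
        self.dead_after = dead_after  # missed heartbeats before failover
        self.last_seen = 0.0          # timestamp of the last heartbeat
        self.holds_vip = False        # True while this node owns the floating IP

    def on_heartbeat(self, now):
        # The peer is alive, so it owns the floating IP (simplified failback).
        self.last_seen = now
        self.holds_vip = False

    def tick(self, now):
        # Count how many whole heartbeat intervals have passed in silence.
        missed = int((now - self.last_seen) / self.interval)
        if missed >= self.dead_after:
            # Presume the peer is dead and bring up the floating IP here.
            self.holds_vip = True
        return self.holds_vip


backup = HeartbeatMonitor(interval=1.0, dead_after=3)
backup.on_heartbeat(now=10.0)
early = backup.tick(now=11.0)  # one interval of silence, peer still trusted
late = backup.tick(now=13.5)   # three intervals missed, backup takes the VIP
```

In a real setup "bringing up the floating IP" means adding 192.168.100.100 to an interface and announcing it on the network, which the HA service does for you.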
The worst that can happen in this case is that clients lose their sessions, but they will be able to reconnect. If you want to avoid that, the two load balancers have to be able to sync session data between them; if they can do that, users won't notice anything except maybe a short delay.
Another pitfall of this setup is split brain: both boxes are online but the link between them is severed, so both bring up the same IP. This is often resolved through some kind of fencing mechanism (SCSI reservation, IPMI restart, smart PDU power cut, ...), or by using an odd number of nodes and requiring a majority of cluster members to be alive before the service is started.
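The majority rule is easy to show in a few lines of Python. This is just the arithmetic, not any cluster manager's real API; `has_quorum` is a name made up for the example.

```python
def has_quorum(total_nodes, reachable_nodes):
    """Return True if this partition sees a strict majority of the cluster.

    reachable_nodes counts the local node plus every peer it can still reach.
    """
    return reachable_nodes > total_nodes // 2


# Two-node cluster, link severed: each side only sees itself.
two_node_split = (has_quorum(2, 1), has_quorum(2, 1))

# Three-node cluster split into a pair and a single node.
three_node_split = (has_quorum(3, 2), has_quorum(3, 1))
```

With two nodes neither side of a split has a majority, so majority voting alone cannot pick a winner and you need fencing. With three nodes at most one side of a split can hold a majority, so only that side starts the service.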
More complex cluster management software (like Pacemaker) can move a whole service (e.g. stop it on one node and start it on another), and this is how HA for services like databases can be achieved.
Another possible way, if you control the routers near your load balancers, is to utilize ECMP (equal-cost multi-path routing). This approach also enables you to scale the load balancers horizontally. It works by having each of your two boxes talk BGP to your router(s). Each box advertises the virtual IP (192.168.100.100), and the router load balances traffic across them via ECMP. If a machine dies, it stops advertising the VIP, which in turn stops the routers from sending traffic to it. The only thing you have to take care of in this setup is to stop advertising the VIP if the load balancer itself dies (for example when the load balancer process crashes while the box, and with it the BGP session, stays up).
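As a rough illustration of what the router does, here is a Python sketch of ECMP-style next-hop selection. Real routers hash packet headers in hardware with vendor-specific functions; the SHA-256 hash, the `pick_next_hop` function, the node names `lb-a`/`lb-b` and the client address are all assumptions made up for the example.

```python
import hashlib


def pick_next_hop(flow, next_hops):
    """Pick one of the equal-cost next hops by hashing the flow tuple.

    flow is (src_ip, src_port, dst_ip, dst_port, protocol); hashing it keeps
    all packets of one connection on the same load balancer.
    """
    if not next_hops:
        raise ValueError("no load balancer is advertising the VIP")
    key = "|".join(str(part) for part in flow).encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return sorted(next_hops)[index]


# Both boxes advertise the VIP over BGP (hypothetical node names).
advertisers = {"lb-a", "lb-b"}
flow = ("203.0.113.7", 51512, "192.168.100.100", 80, "tcp")

before = pick_next_hop(flow, advertisers)

# lb-a dies (or its health check fails) and the route is withdrawn.
advertisers.discard("lb-a")
after = pick_next_hop(flow, advertisers)
```

After the withdrawal every flow lands on the remaining box. Note that with a plain modulo hash like this, changing the number of next hops can also move flows that were on a healthy box, which is one reason session sync between the load balancers still matters.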