Amazon's Elastic Load Balancer (ELB) distributes requests to healthy instances that have been assigned to it. It does not restart or modify those instances (or their number). It determines 'healthy' instances via a health check - typically polling a given location.
What you are asking for, to 'maintain 1 healthy instance', is an Auto Scaling task. Auto Scaling lets you define a group of instances (typically an AMI to launch, the instance type, one or more Availability Zones to launch in, and the minimum/maximum number of instances to maintain), as well as policies by which to scale up and down. Creating an Auto Scaling policy returns an ARN (Amazon Resource Name, a reference to the resource).
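As a rough sketch (all names and sizes here are placeholders, not from your question), these are the parameter sets you would hand to the Auto Scaling API, via the CLI or an SDK, to define such a group and a scale-up policy:

```python
# Sketch of the parameters behind an Auto Scaling setup; the dicts mirror
# what you'd pass to `aws autoscaling create-auto-scaling-group` and
# `aws autoscaling put-scaling-policy`. All names are placeholders.

def asg_params(name, zones, min_size, max_size):
    """Parameters for create-auto-scaling-group."""
    return {
        "AutoScalingGroupName": name,
        # The launch configuration is defined separately and bundles
        # the AMI and instance type.
        "LaunchConfigurationName": f"{name}-lc",
        "AvailabilityZones": zones,
        "MinSize": min_size,   # e.g. 1/1 to always keep one instance alive
        "MaxSize": max_size,
    }

def scale_up_policy(group_name):
    """Parameters for put-scaling-policy; creating the policy returns an ARN."""
    return {
        "AutoScalingGroupName": group_name,
        "PolicyName": f"{group_name}-scale-up",
        "AdjustmentType": "ChangeInCapacity",
        "ScalingAdjustment": 1,  # add one instance each time the policy fires
    }

group = asg_params("web", ["us-east-1a"], 1, 1)
policy = scale_up_policy("web")
```

With MinSize and MaxSize both at 1, Auto Scaling itself replaces a terminated instance, which is exactly the 'maintain 1 healthy instance' behaviour.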
Once you have your Auto Scaling group set up, all you need to do is trigger your scaling policies when an instance becomes unhealthy. If you look closely at the health check that you set up with ELB, you will notice that you can set up an alarm, and that alarm is actually a CloudWatch alarm.
You can set up your own CloudWatch alarms, or create them through ELB's health check; just specify the --alarm-actions option to trigger the Auto Scaling policy's ARN when your unhealthy-node criteria are met.
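For illustration, here is roughly what such an alarm looks like as a put-metric-alarm parameter set (the namespace and metric are the standard ELB ones; the load balancer name and policy ARN are placeholders):

```python
# Sketch of a put-metric-alarm parameter set that fires an Auto Scaling
# policy when the ELB sees unhealthy hosts. The LB name and policy ARN
# are placeholders.

def unhealthy_host_alarm(lb_name, scaling_policy_arn):
    return {
        "AlarmName": f"{lb_name}-unhealthy-hosts",
        "Namespace": "AWS/ELB",                 # ELB's CloudWatch namespace
        "MetricName": "UnHealthyHostCount",
        "Dimensions": [{"Name": "LoadBalancerName", "Value": lb_name}],
        "Statistic": "Average",
        "Period": 60,                           # seconds per datapoint
        "EvaluationPeriods": 2,                 # unhealthy for two periods
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        # This is the --alarm-actions part: the scaling policy's ARN.
        "AlarmActions": [scaling_policy_arn],
    }

alarm = unhealthy_host_alarm("my-elb", "policy-arn-goes-here")
```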
ELB isn't technically required in this setup; Auto Scaling will do the job on its own. What ELB does give you is a DNS name you can use to reach your instance(s), plus a generic error response when no backend is available. With Auto Scaling alone, you would need to re-associate your Elastic IP with the new instance when it launches (which can be scripted).
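If you go the Auto-Scaling-only route, that re-association step is one CLI call from the new instance's boot script. Sketched here as the invocation it would build (both IDs are placeholders):

```python
# Sketch of the boot-time step that moves an Elastic IP to a freshly
# launched instance. The returned list is an `aws ec2 associate-address`
# invocation you could pass to subprocess.run(); IDs are placeholders.

def eip_command(allocation_id, instance_id):
    return [
        "aws", "ec2", "associate-address",
        "--allocation-id", allocation_id,  # identifies the Elastic IP
        "--instance-id", instance_id,      # the newly launched instance
        "--allow-reassociation",           # take it over from the old one
    ]

cmd = eip_command("eipalloc-example", "i-example")
```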
Finally, just to clarify:
CloudFlare is not an AWS service; it is a CDN (and is somewhat well known for mitigating DDoS attacks). Amazon's equivalent service is CloudFront, and you don't need either of them for restarting instances. What you do need is CloudWatch, Amazon's monitoring service. The free tier covers both CloudWatch and a few alarms.
Here are two ways to solve this:
The first option is to add another health check on the host that validates its health and returns HTTP 200s to the ELB if your logic says the host should stay online. That logic is, of course, up to you. The disadvantage is that the check is all-or-nothing per host: if, say, App 2 deployed successfully on only some hosts, every host would still report 'healthy' and keep receiving traffic.
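As a minimal sketch of that first option (the path, port, and per-app check functions are all made up for illustration), the host-side endpoint might look like:

```python
# Minimal sketch of a host-local health endpoint the ELB can poll.
# The path, port, and per-app checks are placeholders; the point is
# that *your* logic decides whether to answer 200 or 503.
from http.server import BaseHTTPRequestHandler, HTTPServer

def check_app1():
    return True  # replace with a real probe, e.g. hit the app's local port

def check_app2():
    return True

def keep_host_online(app1_ok, app2_ok):
    """The 'is this host healthy?' logic; here, both apps must be up."""
    return app1_ok and app2_ok

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        healthy = keep_host_online(check_app1(), check_app2())
        self.send_response(200 if healthy and self.path == "/health" else 503)
        self.end_headers()

# To serve it:
# HTTPServer(("", 8080), HealthHandler).serve_forever()
```

Point the ELB health check at /health on that port, and the host drops out of rotation whenever your combined logic fails.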
Another option is to use an additional ELB for each application. You can point several ELBs at the same backend EC2 instances, and the cost of doing so is pretty minor. That way you can health check per application and drop hosts with issues at a per-application level rather than taking an all-or-nothing approach.
Edit: Please note this is an older answer and is specific to ELB not ALB. ALB supports separate targets on one host natively.
A single ELB routes traffic to exactly one set of instances, and distributes the incoming traffic to all the instances "behind" it. It does not selectively route traffic based on any layer-7 analysis of the traffic, such as the `Host:` header. You need one ELB for each set of instances; as you describe it, that's one ELB for each webapp.
If your primary purpose for running ELB is offloading SSL using a wildcard certificate (I have one system designed like this, with dozens of apps living at many-different-domains.my-wildcard-cert-domain.com), then the instances "behind" the ELB could run a reverse proxy such as HAProxy (or one of several alternatives, like Varnish) that makes layer-7 routing decisions and forwards the traffic to the appropriate subset of machines behind it. This also allows more sophisticated load balancing, and has the advantage of giving you stats and traffic counters, both aggregate and separate.
The intermediate proxy instances can evaluate the `Host:` header (among other things) and even capture the value of the session cookie in their logs for analysis. This setup also lets me run multiple apps on overlapping subsets of instances where appropriate, and do a lot of other things that ELB by itself doesn't directly support. It also returns a custom "503" page when an application gets overloaded or otherwise becomes unavailable, which ELB does not do on its own. I've depicted 2 proxy servers here for no particular reason other than your mention of the number 2 in the question; my setup actually has 3, one for each availability zone in the region where this is deployed.
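For concreteness, a sketch of what the `Host:`-based routing looks like in HAProxy terms (all backend names, domains, and addresses here are invented, not taken from my actual setup):

```
# Sketch of Host-header routing in an HAProxy frontend; names, domains,
# and addresses are placeholders.
frontend www
    bind *:80
    acl is_app1 hdr(host) -i app1.example.com
    acl is_app2 hdr(host) -i app2.example.com
    use_backend app1_servers if is_app1
    use_backend app2_servers if is_app2
    default_backend app1_servers

backend app1_servers
    server web1 10.0.1.10:8080 check
backend app2_servers
    server web2 10.0.1.20:8080 check
```

The `check` keyword enables HAProxy's own per-server health checking, and an `errorfile 503` directive is one way to serve the custom error page mentioned above.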