Having gained some perspective on this over the last few months, I believe the following points address all the concerns above:
1) The comment from @Ross on the original posting is the key. T2 instances, no matter what size and no matter whether they are EC2 or RDS, are throttled to their baseline CPU performance once their CPU credits run out while peak CPU demand continues. In practice the instance appears to stop performing.
2) The failure mode of a CMS web server we have seen most often is shown exactly by this condition: the CloudWatch graph dives towards zero when the CPU percentage needed by httpd processes exceeds the CPU percentage assigned to that instance type (see doc link below).
3) The quick solution for a T2 instance that has exhausted its CPU credits is to stop the instance, change it to a larger instance type, and start it again, which takes about 3-4 minutes. The most useful description of the capacities of the different instance types is here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/t2-instances.html
4) Any production web server on AWS must have an Elastic IP address assigned in advance for this reason: if not, and the instance is rescaled, the IP address will change, leaving the web server inaccessible far beyond what would otherwise only be 3-4 minutes of downtime.
5) The only way to get more CPU credits immediately is to upgrade the instance type; otherwise credits accrue only at the fixed hourly rate for that size. The maximum credit balance each T2 instance size can hold is listed in the doc link above: it always equals the credits that instance type earns in 24 hours.
6) The machine can be returned to its original scale during a bit of scheduled downtime (again, 3-4 minutes) after peak performance demands die down.
7) I/O activity hasn't caused any performance degradation for our web server in any peak periods so far. For General Purpose SSD (gp2) volumes, the baseline IOPS is determined by the EBS volume size. Both the exact meaning of IOPS and that relationship are described here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-io-characteristics.html
8) Neither of the CloudWatch metrics Freeable Memory and DB Connections was of any use in anticipating or correcting performance problems in a web-server-intensive environment.
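To make the credit arithmetic in points 1, 2 and 5 concrete, here is a rough sketch. The hourly earn rates are the ones I recall from the T2 doc linked above, so check them against the current table. The function names are my own, and `cpu_percent` is assumed to be the average utilization per vCPU.

```python
# Hourly credit earn rates for some T2 sizes (assumed from the AWS T2 docs).
# One CPU credit = one vCPU running at 100% for one minute.
CREDITS_PER_HOUR = {
    "t2.micro": 6,
    "t2.small": 12,
    "t2.medium": 24,
    "t2.large": 36,
}


def max_credit_balance(instance_type):
    """Maximum balance: the credits the instance earns in 24 hours."""
    return CREDITS_PER_HOUR[instance_type] * 24


def hours_until_exhausted(instance_type, balance, cpu_percent, vcpus=1):
    """Hours until `balance` credits are used up at a sustained CPU load.

    Returns None when the load spends credits no faster than they are
    earned, so the balance never runs out.
    """
    spent_per_hour = vcpus * cpu_percent * 60 / 100
    net_per_hour = spent_per_hour - CREDITS_PER_HOUR[instance_type]
    if net_per_hour <= 0:
        return None
    return balance / net_per_hour
```

For example, a t2.small starting from a full balance of 288 credits and pinned at 100% CPU spends 60 credits an hour while earning 12, so `hours_until_exhausted("t2.small", 288, 100)` gives 6 hours before throttling sets in.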
Use the SQS solution.
When your Lambda function executes, store the request in an SQS queue.
Use Auto Scaling to launch and terminate your EC2 instance as requests are added to the queue.
On your EC2 instance, poll the SQS queue for work. You can minimize SQS costs by minimizing requests:
- Use long polling, and/or
- Check the queue once each minute, sleeping in between checks.
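The polling approach could look roughly like the sketch below. It is not tied to any SDK: `receive` and `handle` are hypothetical callables you supply, and `max_polls` and `sleep` are parameters I added so the loop can be exercised without waiting.

```python
import time


def poll_queue(receive, handle, interval_seconds=60, max_polls=None,
               sleep=time.sleep):
    """Poll for work, sleeping between checks that return nothing.

    `receive` returns a list of messages (possibly empty) and `handle`
    processes a single message; both are supplied by the caller.
    Returns the number of messages handled.
    """
    polls = 0
    handled = 0
    while max_polls is None or polls < max_polls:
        messages = receive()
        polls += 1
        for message in messages:
            handle(message)
            handled += 1
        if not messages:
            # Queue was empty: wait before checking again.
            sleep(interval_seconds)
    return handled
```

With boto3, `receive` would typically wrap `receive_message` on an SQS client, with `WaitTimeSeconds` set for long polling, and `handle` would call `delete_message` once the work is done. In that case the long poll itself does the waiting, so `interval_seconds` can be small.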
SQS requests are not that expensive. Polling once a minute would yield 43,200 requests a month. This is well below the 1 million requests you get for free each month. Even if not covered by the free tier, the first million requests are only $0.50.
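The arithmetic behind that estimate, as a small sketch. It assumes a 30-day month and uses the price quoted above, so check current SQS pricing before relying on it. The function names are mine.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month
FREE_REQUESTS = 1_000_000              # monthly free tier quoted above
PRICE_PER_MILLION = 0.50               # USD per million, as quoted above


def requests_per_month(interval_seconds):
    """Requests made when polling once every `interval_seconds`."""
    return SECONDS_PER_MONTH // interval_seconds


def monthly_cost(requests, free_tier=True):
    """Cost in USD for `requests`, with or without the free tier."""
    billable = max(0, requests - FREE_REQUESTS) if free_tier else requests
    return billable / 1_000_000 * PRICE_PER_MILLION
```

`requests_per_month(60)` gives the 43,200 figure. A continuous 20-second long poll on an empty queue works out to `requests_per_month(20)`, i.e. 129,600 requests, which is still well inside the free million.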
You can send the CloudWatch events to an SNS topic or an SQS queue and have a subscriber somewhere that makes the API call when the event happens. However, you still need to host the subscriber somewhere, give it permission to subscribe to the topic or queue, and so on.
It would be much easier to use Lambda with a simple IAM Role. Can you elaborate why it isn't an option? Why would you not use the best suited tool for the job at hand?
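For comparison, the Lambda side can be very small. This is only a sketch: `call_api` is a hypothetical stand-in for whatever API call you need to make, and the event keys used are the standard CloudWatch Events envelope fields (`source`, `detail-type`, `detail`).

```python
def make_handler(call_api):
    """Build a Lambda handler that forwards selected event fields to `call_api`.

    `call_api` is a placeholder for the real API call; it receives a dict.
    """
    def lambda_handler(event, context):
        payload = {
            "source": event.get("source"),
            "detail_type": event.get("detail-type"),
            "detail": event.get("detail", {}),
        }
        return call_api(payload)

    return lambda_handler
```

In the deployed module you would bind it once, e.g. `lambda_handler = make_handler(my_api_call)`, and point the function's handler setting at `lambda_handler`. The IAM role then only needs the permissions that the API call itself requires.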