Fastest way to terminate an EC2 instance

amazon-ec2 amazon-web-services boto

I have a requirement to be able to terminate EC2 instances in under a minute.

The current process takes just under 2 minutes per instance because the OS shutdown process takes 60 seconds. I want to speed up terminations considerably, if possible.

Does anyone know of a way to speed up the terminate() function in EC2? Is there a way to "pull the plug" without a shutdown process as other virtualization solutions do?

Background:
In Boto, I call the terminate() function with the wait_until_terminated() function before handling subnet deletion or other follow-up tasks.
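For reference, the flow described above can be sketched roughly as follows. This is a minimal sketch, assuming boto3's EC2 `Instance` resource; the helper names `terminate_and_wait` and `demo`, and the instance ID passed to `demo`, are hypothetical. The timing is only there to show where the two minutes go.

```python
import time


def terminate_and_wait(instance):
    """Terminate an instance and block until AWS reports it as terminated.

    `instance` is assumed to be a boto3 EC2 Instance resource, i.e. an object
    exposing terminate() and wait_until_terminated().
    Returns the elapsed wall-clock time in seconds.
    """
    started = time.monotonic()
    instance.terminate()              # asks AWS to shut the instance down
    instance.wait_until_terminated()  # polls until the state is "terminated"
    return time.monotonic() - started


def demo(instance_id):
    """Hypothetical usage; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the helper above has no hard dependency

    ec2 = boto3.resource("ec2")
    elapsed = terminate_and_wait(ec2.Instance(instance_id))
    print(f"terminated in {elapsed:.1f}s")
```

Subnet deletion and the other follow-up tasks run only after `terminate_and_wait` returns, which is why the whole call has to fit inside the third-party timeout.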

However, I trigger Boto from a third-party API that times out if a process (such as a termination) takes longer than a minute. That means every time I terminate an instance, the API returns an error.

I have tried to work with the 3rd party to increase the timeout, but things like terminations are not in their expected use cases, and as of right now, there is no solution from the 3rd party.

I tried a stop(Force=True) and it is a little faster, but still over a minute.

I tried to forcibly detach the EBS volume, but you have to shut down the instance first, which brings the process over the one-minute mark.

I tried SSH'ing in to run various shutdown and halt command arguments, but I cannot find an OS command that runs faster than 60 seconds. The running services are already at a minimum, and I cannot speed up the OS shutdown any further.

I'm hoping to find a way to "pull the plug" via AWS, or some other method to terminate quickly. It seems that terminations require an OS shutdown, which is a little odd to me when I want to torch the instance anyway.

Best Answer

While I agree this is an XY problem and you should address it another way, there are far faster ways of shutting down the OS than the shutdown command. There is no reason to wait for Linux to run its init scripts and send TERM and KILL to every process.

Historically, I believe killall -9 init or a magic SysRq key was the quickest way. However, systemd lists many ways (man systemd), for example:

   SIGRTMIN+13
       Immediately halts the machine.

   SIGRTMIN+14
       Immediately powers off the machine.

You'll probably have to test a few options before finding the one that AWS reacts to fastest, but going from a 60-second OS shutdown to 1-5 seconds should be simple enough.
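As a rough illustration of the systemd signals quoted above, here is a minimal Python sketch that computes the signal number and, only when asked, sends it to PID 1. The helper names and the `dry_run` flag are my own, not part of systemd or any library. It assumes a Linux guest with systemd running as PID 1 and root privileges. Nothing is shut down cleanly, so unsynced data may be lost.

```python
import os
import signal

# Offsets from SIGRTMIN, taken from the systemd man page excerpt above.
SYSTEMD_IMMEDIATE = {"halt": 13, "poweroff": 14}


def immediate_signal(action):
    """Return the signal number for systemd's immediate 'halt' or 'poweroff'."""
    return signal.SIGRTMIN + SYSTEMD_IMMEDIATE[action]


def send_immediate(action, pid=1, dry_run=True):
    """Send the immediate halt/poweroff signal to systemd (PID 1 by default).

    With dry_run=True (the default) nothing is sent and the signal number is
    only returned, so the helper can be tried safely first.
    """
    signum = immediate_signal(action)
    if not dry_run:
        # Requires root. The machine goes down without a clean shutdown.
        os.kill(pid, signum)
    return signum
```

Calling `send_immediate("poweroff", dry_run=False)` on the instance just before the terminate() call is one of the options worth timing. Whether AWS reacts faster to halt or to poweroff is something to measure rather than assume.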
