Why do cloud providers bill per hour? Who turns off their servers, anyway?

Tags: amazon-ec2, cloud, cloud-computing

I'm not ignorant of tech/Unix/sysadmin matters, but I still can't understand why Amazon, Rackspace, Azure, GoGrid, Linode, and the like bill their instances per hour of use…

I have to ask… Who on earth turns off their servers?

*aaS started as a bundle/plan strategy where you paid for a bucket of services and didn't have to worry about usage… pay $100 a month and you get this, this, and that.

Now that I want to make the switch to Amazon EC2, I get confused by usage percentages and transfer calculations… it's too damn hard and time-consuming.
In my calculations so far, ordering two dedicated servers with 24 GB of RAM each, installing ESXi, and managing the whole thing myself comes out much, much cheaper…

Am I missing something?

Best Answer

Take a scalable infrastructure like the one I'm working with. In my case, the amount of work we can crank out scales well with the number of processing nodes we have running. We have some in-house capacity, and we're working to use exactly these kinds of services for demand that exceeds it.

When we need it, we deploy a bunch of processing nodes to a cloud service like that. Once we have deployment automated, it should be a matter of telling the system "I need 20 new nodes", having it spin up 20 new instances, tweak names as needed, and start chewing away. Once the project is done, we turn off those nodes and go on our merry way.
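For concreteness, here's a minimal sketch of that "spin up 20 nodes, then tear them down" workflow, assuming AWS EC2 and the boto3 SDK; the AMI ID, instance type, and tag values are placeholders for illustration, not anything from our actual setup:

```python
# Minimal sketch: launch a batch of surge workers, then terminate them later.
# Assumes AWS credentials are already configured for boto3.
import boto3

ec2 = boto3.resource("ec2")

def spin_up_workers(count, ami_id="ami-0123456789abcdef0"):
    """Launch `count` processing nodes and tag them so we can find them again."""
    instances = ec2.create_instances(
        ImageId=ami_id,              # hypothetical processing-node image
        InstanceType="c5.xlarge",    # hypothetical instance type
        MinCount=count,
        MaxCount=count,
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "role", "Value": "surge-worker"}],
        }],
    )
    for inst in instances:
        inst.wait_until_running()
    return instances

def tear_down_workers():
    """Once the project is done, terminate everything tagged as a surge worker."""
    workers = ec2.instances.filter(
        Filters=[{"Name": "tag:role", "Values": ["surge-worker"]}]
    )
    for inst in workers:
        inst.terminate()
```

The point of the tag is that teardown doesn't need to remember instance IDs: when the work is finished, one call finds every surge node and shuts it off, which is exactly when the per-hour billing stops.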

Given the cost factors involved (in our case; it won't be the same for everyone), if this happens often enough, it's a good sign that we need to scale out our internal infrastructure a bit more.

There will certainly be a "base load" that we keep running all the time, and for that we like to host that in-house. We'll probably need some always-on instances in the utility-cloud for certain application and data-locality reasons, but those should be in the single digits. Those few servers should be able to support up to hundreds of short-duration processing nodes.

In months when we never need the 'surge' capability, we'll still have to pay for the base-load servers we keep running over there. However, in other months, when we have more work than we know what to do with, we could have hundreds of machines going at any given time.


As for calculations, I've done just that. It requires a spreadsheet and knowing your environment very well. I knew how much data we generally take in per month (the transfer-in bandwidth), how many times that data is read as part of our processing (the storage transfer rates), the growth factor for processed data versus source data (data storage costs), and an estimate of what percentage of a month we're actually doing work (the hours used). If you don't have that, it's hard to estimate billing accurately.

I was able to take what we know about our private environment and build a sheet that predicts what something like AWS would cost versus rolling our own in a colo. It was very informative. In our specific case the cloud vendor we were considering represented anywhere from a 2x to 10x increase in costs versus doing it all ourselves. This was highly useful for upper management, which had been considering going all-in with that cloud vendor.
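To make the shape of that sheet concrete, here's a back-of-the-envelope version in plain Python. Every rate and workload figure below is a made-up placeholder; substitute your provider's actual price list and your own measured numbers before drawing any conclusions.

```python
# Rough monthly estimate from the same inputs the spreadsheet used.
# All prices and workload figures are hypothetical placeholders.
HOURS_PER_MONTH = 730

PRICE_PER_NODE_HOUR = 0.20    # hypothetical on-demand instance rate, $/hour
PRICE_PER_GB_TRANSFER = 0.09  # hypothetical data-transfer rate, $/GB
PRICE_PER_GB_STORED = 0.10    # hypothetical storage rate, $/GB-month

def monthly_cloud_estimate(ingest_gb, read_multiplier, growth_factor,
                           busy_fraction, node_count):
    """Estimate one month's bill from ingest, re-reads, data growth, and usage."""
    compute = HOURS_PER_MONTH * busy_fraction * node_count * PRICE_PER_NODE_HOUR
    transfer = ingest_gb * read_multiplier * PRICE_PER_GB_TRANSFER
    storage = ingest_gb * (1 + growth_factor) * PRICE_PER_GB_STORED
    return compute + transfer + storage

# Example: 500 GB ingested, each GB re-read 3 times during processing,
# processed data doubling the source, nodes busy 25% of the month, 20 nodes.
cloud = monthly_cloud_estimate(500, 3, 2.0, 0.25, 20)
colo = 400.0  # hypothetical flat monthly cost for equivalent in-house capacity
print(f"cloud ≈ ${cloud:,.2f}/month, in-house ≈ ${colo:,.2f}/month "
      f"({cloud / colo:.1f}x)")
```

The ratio that falls out of a calculation like this is what drove our decision; with your own numbers it could easily point the other way, which is exactly why the spreadsheet exercise is worth doing before committing.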

We ended up going with a hybrid approach, since the surge capability the cloud represents is highly useful. It sucks to tell clients we can't meet their deadlines because we've taken on too much work. If anything, the cloud capability can tide us over until we can get infrastructure upgrades in place for our in-house plant.
