I am making an image hosting service. This is my first time building a large site, so I don't have much knowledge about creating a reliable web service, but I THINK I have this figured out.
How should I set up my site hosting to accommodate large amounts of traffic? This is what I was thinking. Tell me if this is a good idea:
- Pick any cheap provider that offers PHP and MySQL.
- Store only the back-end stuff on that cheap provider (PHP scripts, server config files, SQL database).
- Use Amazon S3 to store all the front-end things, like CSS, JS, images, and of course, all the stored images that the users upload (this is an image hosting website).
Does that work? That means that the cost from large amounts of traffic is all incurred through Amazon S3, right? The cheap provider shouldn't get hit with any significant costs because all it's doing is running scripts and updating the database? Or will that also add up (and run slowly)?
Should I move the database to Amazon SimpleDB? I also hear I can run the site using Amazon EC2, but it looks like that takes a lot of work to set up (and is expensive). I guess what I'm asking can be summarized as: what's the most cost-effective way to reliably run an image-hosting website?
Thanks.
Best Answer
So, here are a few points that may help you out.
Your needs will differ depending on your upload configuration: whether files go straight from the user's browser to storage (a client-side upload client) or pass through your own server first (a server-side upload client). The client-side model is cheaper up front in server costs, but be aware that someone will probably find a way to store any kind of file, and you will be responsible for moderating that content. With the server-side model, be prepared for your server costs to increase with user traffic, since you will need to add servers to handle upload requests.
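To make the "any kind of file" problem concrete, here is a minimal sketch of a check that rejects uploads whose leading bytes do not look like a common image format. It is written in Python for brevity; the function names, the accepted formats, and the 5 MB limit are all assumptions for illustration, not something your provider or S3 gives you. In the server-side model it could run before the file is forwarded to storage; in the client-side model it would have to run afterwards, against files already stored.

```python
# Leading "magic" bytes of the formats this hypothetical site accepts.
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}

# Assumed size limit for illustration; pick whatever fits your service.
MAX_UPLOAD_BYTES = 5 * 1024 * 1024


def sniff_image_type(data: bytes):
    """Return the image type suggested by the leading bytes, or None."""
    for signature, kind in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return kind
    return None


def is_acceptable_upload(data: bytes) -> bool:
    """Accept only uploads within the size limit that start like a known image."""
    return len(data) <= MAX_UPLOAD_BYTES and sniff_image_type(data) is not None
```

A signature check like this only filters out casual misuse. A file can start with valid image bytes and still carry other data, so it does not replace moderation.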
Once you have the content hosted, you will also want to look into a CDN (Content Delivery Network) such as Amazon's CloudFront (if you want to stay on the Amazon stack) or Akamai Networks. A CDN will increase your costs at first, but it saves you money on heavily requested content.
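The "costs more at first, saves money later" trade-off can be sketched as a simple break-even calculation. This is a rough model under stated assumptions: the CDN is assumed to carry some fixed monthly overhead and a lower per-GB rate than serving straight from storage. The rates below are placeholders chosen for easy arithmetic, not real CloudFront, Akamai, or S3 prices, so plug in the current price sheets before drawing conclusions.

```python
def monthly_cost(gb_served, per_gb_rate, fixed_fee=0.0):
    """Total monthly delivery cost: fixed overhead plus per-GB transfer."""
    return fixed_fee + gb_served * per_gb_rate


def break_even_gb(origin_rate, cdn_rate, cdn_fixed_fee):
    """Monthly traffic in GB above which the CDN is the cheaper option.

    Returns None when the CDN's per-GB rate is not lower than the origin's,
    because in that case it never catches up.
    """
    if cdn_rate >= origin_rate:
        return None
    return cdn_fixed_fee / (origin_rate - cdn_rate)


# Placeholder numbers for illustration only, NOT real provider prices.
ORIGIN_RATE = 0.125    # cost units per GB served directly from storage
CDN_RATE = 0.0625      # cost units per GB served through the CDN
CDN_FIXED_FEE = 25.0   # cost units per month of assumed fixed CDN overhead

threshold = break_even_gb(ORIGIN_RATE, CDN_RATE, CDN_FIXED_FEE)
```

With these placeholder values the threshold works out to 400 GB per month: below that, serving directly is cheaper; above it, the CDN is.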
Amazon SimpleDB is an interesting style of database. It is 'eventually consistent', which means that data written to the database may not be immediately readable. Amazon S3 originally behaved the same way, although it now offers strong read-after-write consistency. If you are going to use the database to keep data synced across multiple nodes for many real-time transactions, I would not recommend it.
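If "eventually consistent" is hard to picture, here is a toy Python model of the behaviour. It is not SimpleDB's API: the class, its methods, and the explicit `replicate()` step are all made up for illustration. Writes land on a primary copy, and ordinary reads come from a replica that only catches up later.

```python
class EventuallyConsistentStore:
    """Toy store: writes hit a primary copy and reach the replica later."""

    def __init__(self):
        self._primary = {}
        self._replica = {}
        self._pending = []  # writes not yet copied to the replica

    def put(self, key, value):
        """Record a write on the primary and queue it for replication."""
        self._primary[key] = value
        self._pending.append((key, value))

    def get(self, key, consistent=False):
        """Read from the replica by default, or from the primary if asked."""
        source = self._primary if consistent else self._replica
        return source.get(key)

    def replicate(self):
        """Apply all queued writes to the replica, in order."""
        for key, value in self._pending:
            self._replica[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.put("image42:views", 1)
stale = store.get("image42:views")                   # replica has not caught up yet
fresh = store.get("image42:views", consistent=True)  # read from the primary
store.replicate()
settled = store.get("image42:views")                 # replica now has the write
```

Here `stale` is `None` while `fresh` and `settled` are both `1`. In a real system the catch-up happens on its own schedule, which is why code that writes a value and immediately reads it back on another node can see old data.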