What annoys me about this question is that you've phrased it, and loaded it with "facts", in an attempt to get a definitive no.
The truth is that you could develop an App Engine app that replicates the features of Facebook, Twitter, or Tumblr. And assuming the app was well written, it would scale to support hundreds of millions of users. The main reason you wouldn't want to (which doesn't seem to be a consideration for you) is the cost of running a service that size on App Engine.
Also, I fail to see how any of the restrictions you've listed would prevent you from developing such an app.
HTTP Request/Response
- Max request size: 10 MB - wrong, raised to 32 MB.
- Max response size: 10 MB - wrong, raised to 32 MB.
-- If you are developing a social app that frequently needs to deliver pages larger than 10 MB, you are probably doing it wrong. Also, if you do need to deliver content larger than 32 MB, you can use the blobstore for files up to 2 GB.
- You can't access the file system. (forget about saving uploads to filesystem) - wrong. You can read from the local file system and can upload and read/write files to the blobstore.
-- There is no way that Facebook, Twitter, or Tumblr are just taking user uploads and copying them to a folder. Not an issue.
- All requests must respond within 10 minutes otherwise GAE will throw DeadlineExceededException - wrong. It's 30 seconds actually.
-- If you need longer than 30 seconds to deliver results to a user's request you are probably doing it wrong.
- Each cron job must be executed within 30 seconds - wrong, it's 10 minutes.
-- If you can't divide a lengthy task into 10-minute chunks, A: you're probably doing it wrong, and B: you can now move that task to a Backend instance, which doesn't have a time limit on requests.
- Cron jobs cannot utilize map reduce - I've never used map reduce, but I think this requires a citation.
- Every GET or POST to another site is aborted after 5 seconds. You can configure it to wait till 10 seconds max. (intermediate servers would be necessary to work with Twitter and Facebook many times) - True.
-- If a user-facing request to an external API is taking longer than 10 seconds it's probably a good idea to tell the user to retry anyway. If it's not a user-facing request you can automatically retry the task until the API responds.
- Client can not connect to GAE through FTP (only HTTP and HTTPS). - True
-- Why is this an issue? Do you think any large-scale company deploys changes via FTP?
- No https for custom domains. Only for your-app-id.appspot.com domains. - True.
-- It's on the roadmap though.
- If you get an influx of users, you get "over quota" error - Half true.
-- If you properly budget your app you will never see an over quota error. The Royal Wedding site was hosted on App Engine and received 32,000 requests per second. No over quota errors. Also, ever seen the fail whale on Twitter, or the over capacity error on Tumblr? That's essentially their over quota error.
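The advice above about external API calls (retry automatically when the request is not user-facing) can be sketched in plain Python. This is only an illustration, not App Engine's own retry mechanism: `call_with_retry`, `fetch`, and the delay values are hypothetical names and numbers chosen for the example, and `fetch` is assumed to raise `TimeoutError` when the remote API does not answer in time.

```python
import time


def call_with_retry(fetch, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached.

    fetch is a hypothetical zero-argument callable that raises TimeoutError
    when the external API does not answer in time.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last timeout
            # Back off before the next attempt: 0.5s, 1s, 2s, ...
            sleep(base_delay * (2 ** attempt))
```

For a user-facing request you would keep `max_attempts` at 1 and tell the user to retry; for background work, a larger value with backoff avoids hammering an API that is already slow.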
Database
- Database behavior is not the same in the local development than in the actual servers. - False
-- If you mean running the datastore on your laptop is slower than running it on App Engine's cluster, then true, otherwise not true at all.
- GQL. Nothing else. - False
-- Most developers use db filters to query the datastore. Plus, you could equally say that MySQL allows "SQL. Nothing else."
- No query can retrieve more than 1000 records (sucks seriously if you want to allow your client to have a one-click-go-offline-now button) - False.
-- The 1000 record limit was lifted a long time ago. Besides, show me any user-facing page on Facebook, Twitter, or Tumblr that requires more than 1000 records to render.
- If you need linear access to a massive amount of records to perform an operation, you are out of luck (Google's systems are massively clustered)
-- I'm not even sure what you're getting at here. Most people regard the speed of Google's massive cluster as a huge advantage of the system.
- Memcache values max size is 10 MB. - Actually it's 1 MB per memcache entry, same as every other memcache implementation.
- Can't do simple text search - True.
-- It's a feature that's on deck. Most large sites don't do their own text search indexing.
- You can't join 2 tables. - True.
-- App Engine developers need to adjust their thinking from single massive multi-join SQL query to several smaller individual queries, or denormalize data so that joins aren't needed.
- Slow (You have to read about how to separate tables using inheritance so that you can search in a table, get the key and then obtain its parent in order to avoid deserialization performance) - ???
-- translation/citation required.
- "Too many indexes" runtime exception - True
-- There is a limit to the number of indexes in a single app. I've only seen academic research applications hit it though.
- An entity can at most have 5000 property values in an index - True
-- So if someone has more than 5000 friends they would need two entities in the friends group.
- Key names of the form __*__ (start and end with two underscores) are reserved, and should not be used by the application. - True
-- But so what?
- Key names are limited to 500 bytes (UTF-8 encoded, I guess) - True
-- Again, so what? Key names aren't for storing novellas, they're for uniquely identifying an entity.
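The "several smaller queries instead of one join" adjustment mentioned above can be sketched as follows. This is a minimal illustration using in-memory dicts rather than the datastore API: `attach_authors`, `posts`, and `users_by_id` are hypothetical names, with `users_by_id` standing in for a batch get by key.

```python
def attach_authors(posts, users_by_id):
    """Combine posts with their authors without a database join.

    posts and users_by_id are hypothetical in-memory stand-ins for two
    datastore kinds; users_by_id plays the role of a batch get by key.
    """
    # Query 1 already happened: `posts` is its result.
    # Query 2: fetch each distinct author once, not once per post.
    wanted_ids = {post["author_id"] for post in posts}
    authors = {uid: users_by_id[uid] for uid in wanted_ids if uid in users_by_id}

    # Stitch the two result sets together in application code.
    joined = []
    for post in posts:
        author = authors.get(post["author_id"])
        joined.append({
            "title": post["title"],
            "author_name": author["name"] if author else None,
        })
    return joined
```

The denormalized alternative would be to store the author's name on each post when it is written, so that rendering a page needs only the first query.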
Language
- python or java or Go (anything else would have to be translated to these languages) - Half true
-- Actually you can also run any language that runs on the JVM, including PHP and JRuby. Not sure why it's an issue though, Python and Java are two powerful languages with lots of available tools, tutorials, and experienced programmers.
Server Issues
- No static IP (Throttling and quota problems calling third party APIs) - Half true
-- Most third party APIs are aware of App Engine and/or have a relationship with Google. A few times Twitter has accidentally blocked App Engine and it gets fixed within a few hours.
- Each application is limited to 3000 files - Half true
-- If you really need more than 3000 code files for your web application you can use zip imports (Also, you might be doing it wrong).
- No control of OS or hardware running the web app - True
-- App Engine is a Platform as a Service. Not having to worry about servicing the OS or hardware is what people are paying for. This is the key advantage of App Engine, not a limitation.
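The zip import workaround for the file-count limit mentioned above relies on standard Python behaviour: a zip archive placed on `sys.path` is searched for modules like a directory. The sketch below is a self-contained demonstration under that assumption; the archive name `bundled_libs.zip`, the module `greeting_demo`, and the helper functions are all hypothetical and created only for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_zip(directory):
    """Write a one-module archive; stands in for many bundled library files."""
    zip_path = os.path.join(directory, "bundled_libs.zip")
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(
            "greeting_demo.py",
            "def greet(name):\n    return 'hello ' + name\n",
        )
    return zip_path


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import a module stored inside it."""
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()  # make sure the new path entry is noticed
    return importlib.import_module(module_name)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        module = import_from_zip(build_demo_zip(tmp), "greeting_demo")
        return module.greet("App Engine")
```

However many modules the archive holds, it counts as a single file in the deployed application, which is why it helps with a per-application file limit.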
Best Answer
You should do both:
Start with hosting from a CDN such as Google's, because it will likely have higher uptime than your own site and will be configured for the fastest response time. Additionally, anyone who has visited a page that links to the CDN will use their cached copy of the file, so they won't even have to re-download a copy, making the initial loading even faster.
Then add a fallback reference to your own server in case the CDN happens to be down (not likely, but safe is safe). Fallbacks are relatively easy to understand, but need to be customized to suit the script being used:
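As a rough sketch of the fallback pattern described above, here is a small Python helper that renders the two tags: one loading the script from the CDN, and one that writes a tag for the local copy if the library's global is missing after the first tag runs. The function name, the URLs, and the global name in the usage note are hypothetical, and the `window.X || document.write(...)` idiom is one common way to do this, not necessarily the only one.

```python
import re
from html import escape

# Conservative check so the global name can be dropped into the inline script.
_JS_IDENTIFIER = re.compile(r"^[A-Za-z_$][A-Za-z0-9_$]*$")


def script_with_fallback(cdn_url, local_url, global_name):
    """Return HTML that loads a script from a CDN, then falls back to a local copy.

    global_name is the global the library defines (hypothetical input); if it
    is missing after the CDN tag, the inline script writes a tag for local_url.
    """
    if not _JS_IDENTIFIER.match(global_name):
        raise ValueError("global_name must be a plain JavaScript identifier")

    cdn_attr = escape(cdn_url, quote=True)
    local_attr = escape(local_url, quote=True)

    cdn_tag = '<script src="' + cdn_attr + '"></script>'
    # The closing tag inside the inline script is emitted as <\/script>, so the
    # HTML parser does not treat it as the end of the surrounding element.
    inline_js = (
        "window." + global_name + " || "
        "document.write('<script src=\"" + local_attr + "\"><\\/script>')"
    )
    return cdn_tag + "\n<script>" + inline_js + "</script>"
```

Calling `script_with_fallback("https://cdn.example.com/lib.js", "/js/lib.js", "jQuery")` yields two script elements, and the only unescaped closing tags in the output are the two that end those elements.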
Make sure you don't write </script> anywhere within a <script> element, as it will close the HTML element and cause the script to fail. The simple fix is to use a backslash as an escape: <\/script>.
One more reason to do both:
If you pick a popular CDN it's highly unlikely that it'll ever have any downtime. However, in the far far future (~18 months from now given Moore's law), when the hosting format changes, or the address is adjusted, or the network is placed behind a paywall, or anything else, it's possible that your link will no longer work as-is. If you use a fallback, it'll give you a bit of time to adjust to any new hosting format before having to go back through every website you've ever created and change the CDN links.
Another reason to do both:
Recently I've been hit with a string of internet outages. I was able to keep working locally on projects where I'd linked local copies of script resources, and I quickly found that there were a number of projects that needed to have local copies linked.