First of all, could you provide me with some clarification: are you coding a chat room (one-to-many) or individual (one-to-one) chats?
There are a couple of obvious flaws in your current system's design that I would like to point out. I'll start with a brief analysis, then explain what is wrong and what you can do to fix it. If you want to run a successful project, you should start from step one and get your hands dirty with systems analysis and design.
Problem
Website receives a high volume of traffic and eventually crashes.
Requirements
- Sustain high volume of web traffic
- Display previous 150 messages
- Secure communication pathway between clients
Problem Analysis
Right away it is obvious that your site is crashing because the code that opens the file is being called hundreds of thousands of times per second. Opening files and writing to them that often is very expensive I/O.
The FILE Problem: Files do not handle concurrency well at all. In fact, they're terrible at it. Think about it like this: essentially you've opened the same file in Notepad in hundreds of thousands of windows/processes and you're changing the content in all of them simultaneously. When you try to solve a problem with this type of solution you end up with non-deterministic results: it is impossible to predict what data will end up in the file.
Fortunately, there is a way to get deterministic results while still using files if you lock them properly. Unfortunately, this is not a solution to your problem. In your case, only one person would be able to send a message at a time. Surely that is NOT the solution you want!
Wait... there IS a solution:
You CAN USE a Database!
Databases are particularly good at solving this sort of concurrency issue! Depending on the database/engine you use, either the whole table may lock or only a single record. In your case, I would suggest a free database like MySQL with a row-locking engine like InnoDB. If you're not a database rookie, you might want to look into MariaDB as well; it is a fork of the MySQL project by the original developer and a binary drop-in replacement.
Basically, there is no way around using a database for this type of solution. Databases are very powerful, and you can even program stored procedures with them. From your query you can easily select only the 150 most recent messages, ordered by timestamp. All users will be able to send messages at the same time with a row-locking engine like InnoDB.
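As a sketch of the idea (using SQLite here purely so the example is self-contained; in production you would point the same SQL at MySQL/InnoDB, and the table and function names are my own invention), posting a message and fetching the latest 150 each become a single statement:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # sketch only; use MySQL/InnoDB in production

conn.execute("""
    CREATE TABLE messages (
        id      INTEGER PRIMARY KEY AUTOINCREMENT,
        author  TEXT NOT NULL,
        body    TEXT NOT NULL,
        sent_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")

def post(author, body):
    # The storage engine handles locking per write; no manual file locks needed.
    conn.execute("INSERT INTO messages (author, body) VALUES (?, ?)", (author, body))
    conn.commit()

def latest(n=150):
    # The n most recent messages, newest first.
    rows = conn.execute(
        "SELECT author, body FROM messages ORDER BY id DESC LIMIT ?", (n,)
    )
    return rows.fetchall()
```

The `ORDER BY ... LIMIT` clause is the "previous 150 messages" requirement expressed directly in the query, instead of rewriting a flat file on every post.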
I would additionally be a little troubled to find out what the code for the rest of the application looks like. It is very easy to write PHP code that looks fine but performs terribly. I'm not sure whether you're familiar with asymptotic analysis or unit testing, but I highly suggest you thoroughly test your code before pushing it to production. Given the size of your userbase, you should be concerned about optimization and runtime. If your problem and solution were properly analyzed, designed, implemented, tested, and debugged, you would have a much better handle on the situation.
It is also very easy to write insecure code in PHP. I advise you to test your code thoroughly (try to break it) when writing modules that interact with the database. Poorly coded web applications can be very easy to exploit, and given your user base of 300-400k I wouldn't doubt it if Cindy Lou Who suddenly decided to give it her own security audit. If a white-hat hacker discovers a flaw in your system, they will likely encourage you to fix it. If a black-hat hacker discovers it, they will likely use it to spread malware and steal information.
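The classic database-interaction flaw to test for is SQL injection. Here is a small Python/SQLite sketch (the table and function names are made up for illustration) contrasting string interpolation, which lets user input rewrite the query, with parameterized queries, which send values out-of-band so they can never alter the SQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (author TEXT, body TEXT)")

def post_unsafe(author, body):
    # DON'T: a quote or crafted payload in `body` breaks or rewrites the query.
    conn.execute(f"INSERT INTO messages VALUES ('{author}', '{body}')")

def post_safe(author, body):
    # DO: placeholders keep the values as data, never as SQL.
    conn.execute("INSERT INTO messages VALUES (?, ?)", (author, body))
```

With `post_safe`, even a body like `'); DROP TABLE messages; --` is stored verbatim as text; with `post_unsafe`, the same input corrupts the statement.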
You could also use a P2P network; it's architecturally interesting, though more involved.
Using something like Kademlia as a DHT for peer discovery means talking to a limited number of nodes before reaching your target. If you stored your message at each of these hops, you'd have redundancy for your message store that may be reliable enough for your requirements. Offline delivery would mean periodic forwarding attempts for each buffered message. That would guarantee fairly low latency in terms of peer discovery, which is probably the most costly part of the problem.
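To illustrate the metric Kademlia's peer discovery is built on, here is a toy Python sketch (node IDs as plain integers and an arbitrary `k`; a real implementation keeps routing buckets and talks over the network): at each hop you query the `k` known peers whose IDs are XOR-closest to the target, which is what bounds the number of nodes you talk to.

```python
def xor_distance(a: int, b: int) -> int:
    # Kademlia measures closeness between node/key IDs as their XOR,
    # interpreted as an integer.
    return a ^ b

def closest_peers(known_ids, target_id, k=3):
    # At each lookup hop, ask the k peers XOR-closest to the target;
    # each hop's replies get strictly closer, so lookups converge quickly.
    return sorted(known_ids, key=lambda nid: xor_distance(nid, target_id))[:k]
```

Buffering the message at each of these hops is what would give you the redundant store described above.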
Once a direct P2P connection is established, you're clearly in online mode and can skip offline storage (or not).
You can also run dedicated nodes that persistently store messages but otherwise act as regular DHT participants. That would give you more reliability, at the cost of having to run and maintain a handful of nodes yourself.
As @aridlehoover writes, though, there are so many possible answers that you can't really provide a final one.
Best Answer
Some considerations:
Choice of database: relational databases such as MySQL or Postgres are not necessarily harder to scale than MongoDB and the like. In many cases it's quite the opposite. Here is a great comparison of different storage technologies: http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
"Event-driven" architecture: You have a lot of requirements that boil down to "when this happens, do something". A typical way to build such an architecture is to think of events being emitted and event listeners acting upon them. For example: when a "message received" event is emitted, one listener persists the message while another notifies the recipient's device.
You would typically have a message queue as the centerpiece of your application, with all kinds of listeners connected to it, and all emitted events sent through it.
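As a minimal illustration of the pattern (an in-process stand-in, not a real message queue like RabbitMQ; the class, event, and handler names are all made up):

```python
from collections import defaultdict

class EventBus:
    """Tiny in-process stand-in for a message queue with listeners."""
    def __init__(self):
        self._listeners = defaultdict(list)

    def on(self, event, handler):
        # Register a listener for a named event.
        self._listeners[event].append(handler)

    def emit(self, event, payload):
        # Deliver the event to every registered listener.
        for handler in self._listeners[event]:
            handler(payload)

bus = EventBus()
delivered = []
bus.on("message.received", lambda msg: delivered.append(msg))  # e.g. push notifier
bus.on("message.received", lambda msg: None)                   # e.g. persistence listener
bus.emit("message.received", {"from": "alice", "body": "hi"})
```

In a real system each listener would be a separate consumer process on the queue, so persisting, notifying, and analytics scale independently.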
3rd-party solutions: If you are doing this as a learning exercise with a focus on Android development, consider using a 3rd-party solution that can take care of a lot of your backend needs. One example is https://www.firebase.com/. Building a full-blown backend for an app like this, with serious scaling and performance requirements, is a lot of effort that will take time away from learning Android.
Start simple, iterate: I'd start with a very simple backend app without a database and develop it in parallel with the Android app. Make sure you understand as much as possible about your requirements before you make big decisions about which database to use, etc. The only real way to do that is to prototype and iterate as quickly as possible.
Now, some concrete solution "templates" that you can use as starting points:
Firebase-powered backend
Go this route if:
Firebase can take care of much of your needs. You won't need to think about the database or scaling, but you will need to understand their data model.
Some places to start with:
Node.js + Socket.io + MongoDB + Redis
Go this route if:
Rough outline of the system:
Socket.io is good at real-time communication: you won't need to write low-level networking code, and Node.js performs well for this kind of workload.
Pointers:
Java Netty + RabbitMQ + MongoDB
Go this route if:
Outline:
Pointers: