First, I am going to suggest that you may want to post follow-up questions to dba.se. Multi-tenant applications are tricky and different RDBMSs have different tools for solving some of the problems. Without knowing exactly what your environment is, specific answers are often impossible. So I am going to look at conceptual problems and sample answers here.
For customization of cardholder data, my recommendation generally is to store the extra fields in something like XML or JSON format because this gives you a basic framework for storage which is flexible and gives your application a way to handle the data without eav pains. Some db's may even be able to index xpath search results against xml records.
Multi-tenant security is a very different matter and it is hard. The traditional standby is to use a view which can be used to filter out rows only to those which are authorized. Oracle has some ways to bolt these on to the table directly. PostgreSQL has some ways to make sure that other functions applied don't leak information.
If this becomes large, some sort of division of storage is going to be needed. How this gets divided by partitioning tables will be something you will have to decide based on your observed workload. If you are on Oracle thinking about being able to move to RAC as needed may be helpful, and if you are on PostgreSQL, Postgres-XC is the technology to keep in mind. However your current requirements make a normal sharding scheme somewhat questionable and it isn't clear what your usage patterns are going to be. How many cross-tenant cardholders are there likely to be? How much tenant vs cardholder queries are there going to be? These are the sorts of decisions which really should be made when they become problems and premature optimization here will make thins hard.
There are a few alternatives:
1. Replace dots by a dash.
This would be my favorite approach, as it keeps the structure explicit enough.
Since according to you, “it is pretty much a one time insertion,” it should be relatively simple to check if it doesn't break anything (i.e. there is already a same key with a dash). For other situations, doing those checks programmatically requires to write some code, but is still a relatively easy task.
I would strongly advise against this approach, since it would lead to massive debugging headaches down the road. Letting someone who uses the resulting JSON somewhere in the code far away from MongoDB to guess that a dot is not really a dot is a good way to waste literally weeks of someone's time. Keep such Unicode tricks to the hackers who want to trick someone into thinking that a character is a different one.
3. Use BSON.
Since you claim that you “would mostly retrieve the entire document rather than querying for parts of it,” this approach doesn't have major drawbacks in your case. Although, you said “mostly,” which means that sometimes, you will retrieve only parts of the document.
In general, the drawback is that you won't be able to search through the document or to load only a part of it.
4. Use a standard encoding, such as Base64.
Converting the problematic keys (or all keys, depending on the ratio between problematic and non-problematic ones) to Base64 or hexadecimal could be a viable solution, with the benefit of being rather explicit: most developers would recognize Base64 or hexadecimal values at a glance.
The drawback is the increased memory footprint, as well as the necessity to encode and decode the keys when using them.
I would strongly advise against this approach, since is would make the data query ambiguous, and waste hours or days trying to figure out why a specific query doesn't do what you imagined it should be doing. Dot is a reserved character and the check is here to protect you; by telling MongoDB to skip the check, you will only postpone the moment where you'll have to deal with a conflict between the syntax of MongoDB and the reserved character used in a key.
Best Answer
It means applications don't have to care about what machines the shards are located on. You have to know where at least one of the query routers are because that's what you tell your driver in order to access the system. In practice the query routers may or may not be on the same physical machines as the shards, and query routers are separate processes from the application, not necessarily separate physical machines. Most commonly they are on the same machine as the application servers, but it depends on your individual needs.
This is in contrast to some databases where the application has to be aware of the sharding. For example, you might have to connect to a different server to query last names beginning with 'A' than last names beginning with 'B'.