C. J. Date goes into detail about this in Chapter 7 and Appendix B of SQL and Relational Theory. You're right, there's nothing in relational theory that prohibits an attribute's data type from being a relation itself, as long as it's the same relation type on every row. Your example would qualify.
But Date says structures like this are "usually--but not invariably--contraindicated" (i.e. a Bad Idea) because hierarchies of relations are asymmetric. For example, a transformation from nested structure to a familiar "flat" structure cannot always be reversed to recreate the nesting.
Queries, constraints, and updates are more complex, harder to write, and harder for the RDBMS to support if you allow relation-valued attributes (RVA's).
It also muddies database design principles, because the best hierarchy of relations isn't so clear. Should we design a relation of Suppliers with a nested RVA for parts supplied by a given Supplier? Or a relation of Parts with a nested RVA for suppliers who supply a given Part? Or store both, to make it easy to run different types of queries?
This is the same dilemma that results from the hierarchical database and the document-oriented database models. Eventually, the complexity and cost of accessing nested data structures drives designers to store data redundantly for easier lookup by different queries. The relational model discourages redundancy, so RVA's can work against the goals of relational modeling.
From what I understand (I have not used them), Rel and Dataphor are RDBMS projects that support relation-valued attributes.
Re comment from @dportas:
Structured types are part of SQL-99, and Oracle supports these. But they don't store multiple tuples in the nested table per row of the base table. The common example is an "address" attribute which appears to be a single column of the base table, but has further sub-columns for street, city, postal code, etc.
Nested tables are also supported by Oracle, and these do allow multiple tuples per row of the base table. But I am not aware that this is part of standard SQL. And keep in mind the conclusion of one blog: "I'll never use a nested table in a CREATE TABLE statement. You spend all of your time UN-NESTING them to make them useful again!"
How big a data?
There are two significant thresholds:
- whole data fits in the RAM
- whole index data fits in the RAM
With fast SSDs the first threshold became bit less of an issue, unless you have crazy high traffic.
ACIDity
One of the problem with scaling RDBMSes is that by design they are ACID, which means transactions and row level locks (or even table level in some older/simpler RDBMSes). It can be limiting factor if you have lot of queries modifying lot of data running at same time. NoSQL solutions usually go for eventual consistency model.
How do RDBMS scale on data size?
It's not entirely true that RDBMS cannot scale on data size, there are two alternatives: vertical partitioning and horizontal partitioning (aka sharding).
Vertical partitioning is basically keeping unrelated tables on separate DB servers, thus keeping size of each one below thresholds mentioned above. This makes join these tables using plain SQL less straight forward and less efficient.
Sharding means distributing data from one table among various servers, based on specific key. This means that for look ups you know which server to query based on that key. However, this complicates queries that are not look ups on the sharding key.
In case of both kind of partitioning, if you go to extremes, you basically end up with same situation as NoSQL databases.
Best Answer
Just because your NoSql database doesn't have a schema in a traditional sense doesn't mean there isn't a logical schema you need to deal with as it changes. In the case of a typical app using MongoDb, most likely your code expects certain fields of the json object to behave in certain ways. If you change the behavior, it follows you might want to update the already existing data in the database. Now, with traditional RDBMS this was a largely solved problem -- you just had to ALTER the underlying tables. But with these newfangled NoSQL databases, you have a decision -- do you write a script to munge and update all your objects? Or do you add code to convert between versions on the fly? If so, how long do you support v1 objects? Forever? Until v3?
I'll add that the example used in the MongoDb blog post is a bit simplistic and a very easy case to handle if you've got a decent update process no matter what the RDBMS is; adding a field rarely hurts. It is when you decide to split your
Name
field intoFirstName
andLastName
that things get exciting.