Databases are out of process - they run on a different server usually. So even if you had an API, it would need to send something across the wire that represents your query and all of its projections, filters, groups, subqueries, expressions, joins, aggregate functions etc. That something could be XML or JSON or some proprietary format, but it may as well be SQL because that is tried, tested and supported.
It is less common these days to build up SQL commands yourself - many people use some sort of ORM. Even though these ultimately translate into SQL statements, they may provide the API you are after.
Deep down, in the guts of a relational database, its all rows and columns. That is the structure that a relational database is optimized to work with. Cursors work on individual rows at a time. Some operations create temporary tables (again, it needs to be rows and columns).
By working with only rows and returning only rows, the system is able to better deal with memory and network traffic.
As mentioned, this allows for certain optimizations to be done (indexes, joins, unions, etc...)
If one was to want a nested tree structure, this requires that one pulls all the data at once. Gone are the optimizations for the cursors on the database side. Likewise, the traffic over the network becomes one big burst that can take much longer than the slow trickle of row by row (this is something that is occasionally lost in today's web world).
Every language has arrays within it. These are easy things to work with and interface with. By using a very primitive structure, the driver between the database and program - no matter what language - can work in a common way. Once one starts adding trees, the structures in the language become more complex and more difficult to traverse.
It isn't that hard for a programing language to convert the rows returned into some other structure. Make it into a tree or a hash set or leave it as a list of rows that you can iterate over.
There is also history at work here. Transferring structured data was something ugly in the days of old. Look at the EDI format to get an idea of what you might be asking for. Trees also imply recursion - which some languages didn't support (the two most important languages of the old days didn't support recursion - recursion didn't enter Fortran until F90 and of the era COBOL didn't either).
And while the languages of today have support for recursion and more advanced data types, there isn't really a good reason to change things. They work, and they work well. The ones that are changing things are the nosql databases. You can store trees in documents in a document based one. LDAP (its actually oldish) is also a tree based system (though its probably not what you're after). Who knows, maybe the next thing in nosql databases will be one that returns back the query as a json object.
However, the 'old' relational databases... they're working with rows because thats what they're good at and everything can talk to them without trouble or translation.
- In protocol design, perfection has been reached not when there is nothing left to add, but when there is nothing left to take away.
From RFC 1925 - The Twelve Networking Truths
Best Answer
There are too many cases where using a literal is the right approach.
From a performance standpoint, there are times that you want literals in your queries. Imagine I have a bug tracker where once it gets big enough to worry about performance I expect that 70% of the bugs in the system will be "closed", 20% will be "open", 5% will be "active" and 5% will be in some other status. I may reasonably want to have the query that returns all active bugs to be
rather than passing the
status
as a bind variable. I want a different query plan depending on the value passed in forstatus
-- I'd want to do a table scan to return the closed bugs and an index scan on thestatus
column to return the active loans. Now, different databases and different versions have different approaches to (more or less successfully) allow the same query to use a different query plan depending on the value of the bind variable. But that tends to introduce a decent amount of complexity that needs to be managed to balance out the decision of whether to bother re-parsing a query or whether to reuse an existing plan for a new bind variable value. For a developer, it may make sense to deal with this complexity. Or it may make sense to force a different path when I have more information about what my data is going to look like than the optimizer does.From a code complexity standpoint, there are also plenty of times that it makes perfect sense to have literals in SQL statements. For example, if you have a
zip_code
column that has a 5 character zip code and sometimes has an additional 4 digits, it makes perfect sense to do something likerather than passing in 4 separate parameters for the numeric values. These aren't things that will ever change so making them bind variables only serves to make the code potentially more difficult to read and to create the potential that someone will bind parameters in the wrong order and end up with a bug.