I keep hearing people (Crockford in particular) saying the DOM is a terrible API, but not really justifying this statement. Apart from cross-browser inconsistencies, what are some reasons why the DOM is considered to be so bad?
Javascript – What’s so bad about the DOM
Tags: api, api-design, dom, javascript
Related Solutions
What follows is my best reading of the relevant specifications and references. (I found Mozilla's abstracts of the DOM levels and the associated links especially helpful.) I encourage corrections or clarifications from others.
So does the DOM talk about how an XML/HTML document is represented as an object model?
Yes. There are two parts to the DOM Level 1 specification: Core and HTML. The Core specification describes a general DOM that can represent any structured document. The HTML specification describes how to use the Core DOM to represent HTML documents specifically, and adds HTML-specific interfaces.
The DOM does not specify that documents must be implemented as a tree or a grove, nor does it specify how the relationships among objects are to be implemented. In what other ways can the document be represented?
DOM Core does assume that the document is a tree. The Node interface is the "...primary datatype for the entire [DOM]. It represents a single node in the document tree." Node has several properties for accessing child, sibling, and parent nodes (e.g., parentNode, firstChild, etc.) that imply a tree structure. You could use a flat tree or a linear tree (e.g., a linked list), but it's still going to be some form of tree.
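In fact, those Node properties alone are enough to walk the entire document tree. A minimal sketch using only DOM Level 1 properties:

```javascript
// Depth-first walk of the document tree using only DOM Level 1
// Node properties (firstChild and nextSibling).
function walk(node, visit) {
  visit(node);
  for (var child = node.firstChild; child !== null; child = child.nextSibling) {
    walk(child, visit);
  }
}

// Usage: log the name of every node under the document root.
walk(document.documentElement, function (node) {
  console.log(node.nodeName);
});
```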
As George Mauer points out in the comments, perhaps you mean that the underlying model of a particular implementation does not need to be a tree. That much is true; as long as your implementation provides the functionality promised in the DOM specification, you can use whatever structure you like to provide that functionality.
Are the objects and methods within the DOM parsing engine in their own native language?
Generally, yes. In most browsers, the DOM is implemented in a lower-level language like C, and the browser supplies bindings to the JavaScript environment that can manipulate the actual representations. In fact, if you look at the question Meaning of “Moving DOM into Javascript”?, you'll see that Google is interested in switching to a native JavaScript DOM implementation (likely to avoid needing both a C++ function and a duplicate JavaScript wrapper for that C++ function; possibly also for performance gains).
What is responsible for translating from JavaScript to the native language?
I'm a little bit hazier on this subject, but my understanding is that when a JavaScript DOM binding is invoked, the JavaScript execution environment (which is itself implemented in a lower-level language like C) makes a call to the relevant DOM function (written in C/C++) to manipulate the DOM.
If you want to go deeper than that, you'll need to talk to someone who actually makes browsers.
Does that mean that when you create language bindings by following the IDL, you are just creating objects in the specific language (e.g., C++) that your DOM parsing engine is built with?
Yes. The DOM's IDL is language-agnostic, so you can implement it in any language. "Writing a DOM implementation" means writing code (in a particular language) that conforms to the IDL interfaces described in the DOM specifications.
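To illustrate, here is a heavily simplified, hypothetical sketch of what implementing a small slice of the Level 1 Node interface could look like in JavaScript; a real implementation would have to conform to the full interface and its specified behavior:

```javascript
// Hypothetical, heavily simplified sketch of implementing part of the
// DOM Level 1 Node IDL in JavaScript. MyNode is an illustrative name,
// not part of any specification.
function MyNode(nodeName) {
  this.nodeName = nodeName;   // readonly attribute DOMString nodeName;
  this.parentNode = null;     // readonly attribute Node parentNode;
  this.childNodes = [];       // readonly attribute NodeList childNodes;
}

// Node appendChild(in Node newChild) raises(DOMException);
MyNode.prototype.appendChild = function (newChild) {
  newChild.parentNode = this;
  this.childNodes.push(newChild);
  return newChild;
};
```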
The proxy option is the easiest one to implement. There is no custom development to do; the only thing required is to set up a proxy (a minimal sketch follows the list below). It's also straightforward: there is no additional code to maintain, and if the API changes, you have no changes to make on your side.
A proxy would be a preferred choice:
If you need to ship working software fast. This makes it a good choice if, for example, you were about to ship a feature but found during the implementation phase of the project that you can't just make cross-domain AJAX requests.
Or if the current API is well designed: the architecture is good, the calls are very clear, the documentation is complete and easy to understand.
Or if the current API is subject to change. If it changes, you just need to change the JavaScript implementation. If instead of a proxy you are parsing the results and generating your own JSON, there is a risk that changes to the API will require changes to your server-side code.
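For illustration, a minimal pass-through proxy sketch using Node.js's built-in http module; api.example.com is a hypothetical upstream API, and error handling, header filtering, and HTTPS are omitted:

```javascript
// Minimal pass-through proxy sketch (Node.js built-in http module).
// "api.example.com" is a hypothetical upstream API.
var http = require('http');

http.createServer(function (clientReq, clientRes) {
  var upstream = http.request({
    hostname: 'api.example.com',
    path: clientReq.url,
    method: clientReq.method,
    headers: clientReq.headers
  }, function (upstreamRes) {
    // Relay the upstream status, headers, and body unchanged.
    clientRes.writeHead(upstreamRes.statusCode, upstreamRes.headers);
    upstreamRes.pipe(clientRes);
  });
  // Forward the request body (if any) to the upstream API.
  clientReq.pipe(upstream);
}).listen(8080);
```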
On the other hand, parsing the result has the benefit of making it possible to completely abstract the API away from the client side. This is the slower alternative, since it requires designing a new interface (if the original API is not well designed) and implementing the extract, transform, and load steps (see the sketch after this list), but it may be a good long-term choice for a large project. This is a preferred choice:
If you need additional features. You can add features which weren't available in the original API, such as caching at a level an ordinary proxy server doesn't support, encryption, or a different authentication model.
For example, if the number of AJAX requests becomes an issue, or if a two-way communication model makes sense, you can implement WebSockets.
Or if the current API is not well designed. Like the facade pattern, this approach enables you to redesign the API. If the original one is poor, a facade makes it possible to fix the bad design choices made by the original authors of a legacy API. You can act on large-scale aspects, such as the overall architecture of the API, as well as on details, such as the names of arguments or the error messages.
While modifying an existing API is sometimes impossible, a facade lets you work with a piece of clean code which abstracts away the drawbacks and errors of the original design.
Or if the current API is subject to change. You may prefer to change server-side code instead of JavaScript when the API changes over time, keeping the public interface of your facade unaffected. That may be easier because you're more experienced with server-side programming, because you know more tools for server-side refactoring, or because it's easier in your project to deal with server-side code versioning.
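As a rough illustration of the facade approach, the sketch below fetches from the original API, reshapes the payload, and exposes a cleaner JSON contract to the client. The upstream URL and field names are hypothetical:

```javascript
// Minimal facade sketch (Node.js): extract from the original API,
// transform the awkward upstream shape, and load it into our own
// cleaner JSON interface. URL and field names are hypothetical.
var http = require('http');

http.createServer(function (req, res) {
  http.get('http://api.example.com/products', function (upstream) {
    var body = '';
    upstream.on('data', function (chunk) { body += chunk; });
    upstream.on('end', function () {
      var original = JSON.parse(body);
      // Redesign the interface: clearer names, saner units.
      var clean = original.map(function (p) {
        return { id: p.prd_id, name: p.prd_nm, price: p.prc_cents / 100 };
      });
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(clean));
    });
  });
}).listen(8080);
```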
You may notice that I omitted talking about JSON, performance, caching, etc. There is a reason for that:
JSON vs. XML: it's up to you to pick the right technology. Do it by objectively measuring the overhead of XML over JSON, the time it takes to serialize data, and the ease of parsing.
Performance: benchmark different implementations, pick the fastest one, then profile it and optimize it based on the results from the profiler. Stop when you achieve the performance specified in the non-functional requirements.
Also, understand what you are trying to achieve. There are several parts interacting with each other: the original API, the bandwidth between your server and the API's server, the performance of your server, the bandwidth between your server and the end users, and the performance of their machines. If you're asked to respond to a request within 30 ms but the original API spends 40 ms processing the request, no matter what you do, you won't be able to achieve the required performance.
Caching: caching is one of the techniques for making your web application feel faster, reducing bandwidth usage, etc.
Make sure you use client-side caching as well (server-side caching won't reduce bandwidth usage between you and your customers), even though setting up the HTTP headers properly is often tricky; see the sketch after this list.
Make sure you determine correctly what to cache, for how long, and when to invalidate it: if the description of a product changed 10 seconds ago but the customers of an e-commerce website still see the old version, that's OK. If the owner changed the description, submitted it, and still sees the previous variant because of caching, that's a problem.
Don't focus only on caching. Minification, for example, is important as well. Reducing the number of requests can also be beneficial.
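On the header side, here is a minimal sketch of serving a cacheable resource with Cache-Control and an ETag validator from Node.js; the 60-second max-age and the toy ETag scheme are arbitrary illustrations:

```javascript
// Minimal sketch of client-side caching via HTTP headers (Node.js).
// The max-age value and the toy ETag scheme are arbitrary choices.
var http = require('http');

http.createServer(function (req, res) {
  var body = JSON.stringify({ description: 'current product description' });
  var etag = '"' + body.length + '"'; // toy validator; use a real hash

  // If the client's cached copy is still valid, skip the body entirely.
  if (req.headers['if-none-match'] === etag) {
    res.writeHead(304); // Not Modified
    res.end();
    return;
  }

  res.writeHead(200, {
    'Content-Type': 'application/json',
    // Let clients reuse the response for 60 seconds, then revalidate.
    'Cache-Control': 'public, max-age=60',
    'ETag': etag
  });
  res.end(body);
}).listen(8080);
```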
Best Answer
Crockford has given an extensive presentation titled "An Inconvenient API: The Theory of the DOM", where he more or less explains his opinions on the DOM. It's longish (1h 18m), but as with most of Crockford's presentations, it's quite enjoyable and educational.
Cross-browser inconsistencies seem to be his main concern, and I agree they are the single most annoying thing about the DOM. He identifies several key issues behind the various inconsistencies, adding that presentation, sessions, and interactivity were never anticipated in the original vision of the web. Some examples of the inconsistencies include:
document.all, a Microsoft-only feature
name and id used to be interchangeable (document.getElementById(id), document.getElementsByName(name), node.getElementsByTagName(tagName))
He continues with a few more examples, mostly targeting traversal of the DOM, memory leaks, and event trickling and bubbling, and there is a summary slide, titled "The Cracks of DOM", that sums it all up.
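To give a concrete taste of the event-model inconsistencies he describes, attaching an event handler portably in that era meant branching on the browser, as in this well-known shim (W3C browsers used addEventListener, IE 8 and earlier used attachEvent):

```javascript
// Classic cross-browser event shim from the era Crockford describes.
function addEvent(element, type, handler) {
  if (element.addEventListener) {
    element.addEventListener(type, handler, false); // W3C model
  } else if (element.attachEvent) {
    element.attachEvent('on' + type, handler);      // old IE model
  } else {
    element['on' + type] = handler;                 // last-resort fallback
  }
}
```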
In short, it's a messy, messy API. It might seem like nitpicking, but keep in mind that when you develop for the web, you rarely get to pick the browser your customers will use. Having to test everything in at least two versions of each of the major browsers gets old very quickly. An API is supposed to be consistent; the DOM was a victim of the browser wars, but it's getting better. It's still not as platform-neutral as the W3C (and, I think, all of us) would like it to be, but browser vendors seem much more eager to cooperate than they were five or ten years ago.