JavaScript – A few clarifications about the DOM

dom javascript

I have been trying to understand the DOM, and although I have a fair idea of what it is, there are certain ideas I just cannot pin down. Below I list what I think the DOM is, with my questions inline.

The DOM is a fully object-oriented representation of the web page. The W3C DOM standard forms the basis of the DOM implemented in most modern browsers.

So does the DOM talk about how an XML/HTML document is represented as an object model?

The DOM does not specify that documents must be implemented as a tree or a grove, nor does it specify how the relationships among objects be implemented.

In what other ways can the document be represented?

When you do something like this:
```
document.write('welcome to my home page!');
```
the document object is provided by the DOM. Methods such as write are the interfaces that the DOM exposes to JavaScript.

So are the objects and their methods created as JavaScript objects by the DOM parser and then presented to the JavaScript engine? Or do the objects and methods live inside the DOM parsing engine in its own native language, and are then exposed to the JavaScript engine? If so, what is responsible for translating from JavaScript to the native language?

What are language bindings?

The language binding is the set of objects native to the language in question that implements each of the interfaces in the DOM specification.

Developers can create language bindings from the DOM to their language simply by following the IDL (Interface Definition Language) in the DOM specification.

So if the DOM parsing engine is implemented in, say, C++, does that mean that when you create language bindings by following the IDL, you are just creating objects in the specific language (i.e., C++) that your DOM parsing engine is built with?

Best Answer

What follows is my best reading of the relevant specifications and references. (I found Mozilla's abstracts about DOM levels and associated links especially helpful.) I encourage corrections or clarifications from others.

So does the DOM talk about how an XML/HTML document is represented as an object model?

Yes. There are two parts to the DOM Level 1 specification: Core and HTML. The Core DOM specification describes a general DOM that could be used to represent any structured document. The HTML DOM specification describes how to use the Core DOM to describe HTML documents specifically and includes HTML-specific interfaces.

The DOM does not specify that documents must be implemented as a tree or a grove, nor does it specify how the relationships among objects be implemented. In what other ways can the document be represented?

DOM Core does assume that the document is a tree. The Node interface is the "...primary datatype for the entire [DOM]. It represents a single node in the document tree." Node has several properties for accessing child, sibling, and parent nodes (e.g., parentNode, firstChild, etc.) that imply a tree structure. You could use a flat tree or a linear tree (e.g., a linked list), but it's still going to be some form of tree.
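To make the tree shape concrete, here is a minimal sketch that runs outside a browser. `ToyNode` and `walk` are hypothetical names made up for this illustration; only the property names (parentNode, firstChild, lastChild, previousSibling, nextSibling) and appendChild are borrowed from the Node interface, and the real interface has many more members.

```javascript
// Toy stand-in for the DOM Node interface (hypothetical class, not a browser API).
class ToyNode {
  constructor(nodeName) {
    this.nodeName = nodeName;
    this.parentNode = null;
    this.firstChild = null;
    this.lastChild = null;
    this.previousSibling = null;
    this.nextSibling = null;
  }

  // Append a child and keep the parent and sibling links consistent.
  // Simplification: assumes the child is not already attached elsewhere.
  appendChild(child) {
    child.parentNode = this;
    child.previousSibling = this.lastChild;
    if (this.lastChild) {
      this.lastChild.nextSibling = child;
    } else {
      this.firstChild = child;
    }
    this.lastChild = child;
    return child;
  }
}

// Depth-first walk that uses only firstChild and nextSibling.
function walk(node, depth = 0, out = []) {
  out.push("  ".repeat(depth) + node.nodeName);
  for (let c = node.firstChild; c !== null; c = c.nextSibling) {
    walk(c, depth + 1, out);
  }
  return out;
}

const html = new ToyNode("html");
const head = html.appendChild(new ToyNode("head"));
const body = html.appendChild(new ToyNode("body"));
body.appendChild(new ToyNode("p"));

console.log(walk(html).join("\n"));
```

Any structure that can answer "who is my parent, my first child, my next sibling" is a tree as far as a caller is concerned, which is all the walk relies on.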

As George Mauer points out in the comments, perhaps you mean that the underlying model of a particular implementation does not need to be a tree. That much is true; as long as your implementation provides the functionality promised in the DOM specification, you can use whatever structure you like to provide that functionality.
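As a rough illustration of that freedom, the sketch below stores a document in two flat arrays and still answers tree-style questions. `FlatNode`, `names`, and `parents` are hypothetical names invented for this example, and it assumes the rows are stored in document order.

```javascript
// Hypothetical flat storage: one row per node, relationships kept as indices.
const names = ["html", "head", "body", "p"];
const parents = [-1, 0, 0, 2]; // index of each node's parent, -1 for the root

// View object that exposes tree-style properties over the flat arrays.
class FlatNode {
  constructor(index) {
    this.index = index;
  }
  get nodeName() {
    return names[this.index];
  }
  get parentNode() {
    const p = parents[this.index];
    return p === -1 ? null : new FlatNode(p);
  }
  get firstChild() {
    // The first row whose parent is this node.
    const i = parents.indexOf(this.index);
    return i === -1 ? null : new FlatNode(i);
  }
  get nextSibling() {
    const p = parents[this.index];
    if (p === -1) return null; // the root has no siblings
    // The next later row that shares this node's parent.
    const i = parents.indexOf(p, this.index + 1);
    return i === -1 ? null : new FlatNode(i);
  }
}

const root = new FlatNode(0);
console.log(root.firstChild.nodeName);
console.log(root.firstChild.nextSibling.nodeName);
```

One shortcut to be aware of: each getter returns a fresh wrapper, so two lookups of the same node are not `===`. A real implementation would have to preserve object identity, for example by caching wrappers.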

Are the objects and methods within the DOM parsing engine in their own native language?

Generally, yes. In most browsers, the DOM is implemented in a lower-level language like C or C++, and the browser supplies bindings to the JavaScript environment that can manipulate the actual representations. In fact, if you look at the question Meaning of “Moving DOM into Javascript”?, you'll see that Google is interested in switching to a native JavaScript DOM implementation (likely to avoid needing both a C++ function and a duplicate JavaScript wrapper for that C++ function; possibly also for performance gains).

What is responsible for translating from JavaScript to the native language?

I'm a little bit hazier on this subject, but my understanding is that when a JavaScript DOM binding is invoked, the JavaScript execution environment (which is itself implemented in a lower-level language like C) makes a call to the relevant DOM function (written in C/C++) to manipulate the DOM.
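The shape of that hand-off can be simulated in plain JavaScript. This is only a sketch of the idea, not how any particular browser does it: `nativeDom` stands in for the C/C++ side, `ElementBinding` for the object a script sees, and both names and all of their methods are made up for the example.

```javascript
// Stand-in for the engine's native side. In a real browser this would be C++;
// here it is plain JavaScript so the sketch can run anywhere.
const nativeDom = (() => {
  const records = new Map(); // handle -> internal node record
  let nextHandle = 1;
  return {
    createElement(tagName) {
      const handle = nextHandle++;
      records.set(handle, { tagName: tagName.toUpperCase(), children: [] });
      return handle;
    },
    appendChild(parentHandle, childHandle) {
      records.get(parentHandle).children.push(childHandle);
    },
    childCount(handle) {
      return records.get(handle).children.length;
    },
    tagName(handle) {
      return records.get(handle).tagName;
    },
  };
})();

// The "binding": the object scripts see. It holds only an opaque handle and
// forwards every operation to the native side.
class ElementBinding {
  #handle;
  constructor(tagName) {
    this.#handle = nativeDom.createElement(tagName);
  }
  get tagName() {
    return nativeDom.tagName(this.#handle);
  }
  get childElementCount() {
    return nativeDom.childCount(this.#handle);
  }
  appendChild(child) {
    nativeDom.appendChild(this.#handle, child.#handle);
    return child;
  }
}

const list = new ElementBinding("ul");
list.appendChild(new ElementBinding("li"));
console.log(list.tagName, list.childElementCount); // UL 1
```

The script never touches the internal records directly; it only calls methods on the wrapper, and the wrapper decides which native function to invoke.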

If you want to go deeper than that, you'll need to talk to someone who actually makes browsers.

Does that mean that when you create language bindings by following the IDL, you are just creating objects in the specific language (i.e., C++) that your DOM parsing engine is built with?

Yes. The DOM's IDL is language-agnostic, so that you can implement it in any language. "Writing a DOM implementation" means writing code (in a particular language) to conform to the IDL interfaces described in the DOM specifications.
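As a small illustration of "conforming to the IDL", the sketch below quotes an abbreviated fragment of the Node interface in a comment and checks a JavaScript class against it. The data format of `nodeInterface`, the `SketchNode` class, and the `conformsTo` helper are all hypothetical and exist only for this example; a real conformance test would check types and behavior, not just member names.

```javascript
// Abbreviated from the Node interface in the DOM Level 1 Core IDL:
//   interface Node {
//     readonly attribute DOMString nodeName;
//     readonly attribute Node      parentNode;
//     Node appendChild(in Node newChild) raises(DOMException);
//   };
// The same shape written as data (hypothetical format, for this sketch only).
const nodeInterface = {
  attributes: ["nodeName", "parentNode"],
  operations: ["appendChild"],
};

// One possible binding: a class whose members line up with the IDL.
class SketchNode {
  constructor(nodeName) {
    this.nodeName = nodeName;
    this.parentNode = null;
    this.children = [];
  }
  appendChild(newChild) {
    newChild.parentNode = this;
    this.children.push(newChild);
    return newChild;
  }
}

// Shallow conformance check: are the named members present with the right kind?
function conformsTo(obj, iface) {
  return (
    iface.attributes.every((name) => name in obj) &&
    iface.operations.every((name) => typeof obj[name] === "function")
  );
}

console.log(conformsTo(new SketchNode("p"), nodeInterface)); // true
console.log(conformsTo({ nodeName: "p" }, nodeInterface));   // false
```

The IDL says nothing about how `children` is stored or which language the class is written in; it only fixes the members a caller may rely on.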

Related Solutions

Javascript – Self-referencing anonymous closures: is JavaScript incomplete

I suppose everything is a matter of taste, but does this not look like a kludge, when all you want is a private namespace? Couldn't JavaScript implement packages and proper classes?

Most of the comments argue against the myth that "prototypes are poor man's classes", so I'll just repeat that prototype-based OO isn't inferior in any way to class-based OO.

The other point: "a kludge when all you want is a private namespace". You might be surprised to know that Scheme uses the exact same kludge to define scopes. That didn't stop it from becoming the archetypal example of lexical scoping done well.

Of course, in Scheme, the 'kludge' is hidden behind macros....
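For readers who have not seen the pattern under discussion, here is a minimal sketch of a closure used as a private namespace. The `counter` object and its methods are invented for the example.

```javascript
// An immediately invoked function expression (IIFE) used as a private namespace.
const counter = (function () {
  let count = 0; // private: visible only inside this closure

  function increment() {
    count += 1;
    return count;
  }

  function reset() {
    count = 0;
  }

  // Only the returned object is visible to the outside.
  return { increment, reset };
})();

counter.increment();
counter.increment();
console.log(counter.increment()); // 3
console.log(counter.count);       // undefined: the variable is not a property
```

The function exists only to create a scope; lexical scoping does the rest, which is the same mechanism Scheme relies on.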

JavaScript – Is JavaScript Interpreted by Design?

So am I to take it that the interpreted part is a requirement in the language specification, or is it misleading to say that the language is an interpreted programming language when respecting the difference between a language and its many implementations?

EcmaScript language geeks often use the term "ES interpreter" to refer to an implementation of EcmaScript, but the spec does not use that term. The language overview in particular describes the language in interpreter-agnostic terms:

ECMAScript is object-based: basic language and host facilities are provided by objects, and an ECMAScript program is a cluster of communicating objects.

So EcmaScript assumes a "host environment" which is defined as a provider of object definitions including all those that allow I/O or any other links to the outside world, but does not require an interpreter.

The semantics of statements and expressions in the language are defined in terms of the Completion specification type, which is trivial to implement in an interpreter, but the specification does not require an interpreter.

8.9 The Completion Specification Type

The Completion type is used to explain the behaviour of statements (break, continue, return and throw) that perform nonlocal transfers of control. Values of the Completion type are triples of the form (type, value, target), where type is one of normal, break, continue, return, or throw, value is any ECMAScript language value or empty, and target is any ECMAScript identifier or empty.

The term “abrupt completion” refers to any completion with a type other than normal.
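To show why this is "trivially implemented in an interpreter", here is a sketch of an evaluator for a made-up mini-language in which every statement returns a completion triple. The statement kinds, the `evaluate` function, and the AST shape are all hypothetical; labels are ignored, so target is always left empty (modelled as undefined).

```javascript
// Completion records as plain objects: (type, value, target).
const normal = (value) => ({ type: "normal", value, target: undefined });
const abrupt = (type, value) => ({ type, value, target: undefined });

// Evaluate a statement of a hypothetical mini-language; always returns a completion.
function evaluate(stmt, env) {
  switch (stmt.kind) {
    case "expr":
      return normal(stmt.run(env));
    case "break":
      return abrupt("break", undefined);
    case "return":
      return abrupt("return", stmt.run(env));
    case "if":
      return stmt.test(env) ? evaluate(stmt.then, env) : normal(undefined);
    case "block": {
      let last = normal(undefined);
      for (const s of stmt.body) {
        last = evaluate(s, env);
        if (last.type !== "normal") return last; // abrupt: stop and propagate
      }
      return last;
    }
    case "while": {
      while (stmt.test(env)) {
        const c = evaluate(stmt.body, env);
        if (c.type === "break") break;     // the loop consumes "break"
        if (c.type !== "normal") return c; // "return" keeps propagating
      }
      return normal(undefined);
    }
    default:
      throw new Error("unknown statement kind: " + stmt.kind);
  }
}

// Models: while (true) { if (i === 3) break; i += 1; } return i;
const program = {
  kind: "block",
  body: [
    {
      kind: "while",
      test: () => true,
      body: {
        kind: "block",
        body: [
          { kind: "if", test: (env) => env.i === 3, then: { kind: "break" } },
          { kind: "expr", run: (env) => (env.i += 1) },
        ],
      },
    },
    { kind: "return", run: (env) => env.i },
  ],
};

const result = evaluate(program, { i: 0 });
console.log(result.type, result.value); // return 3
```

Each construct only has to decide whether to consume an abrupt completion or pass it up, which is why the interpreter falls out almost mechanically from the specification text.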

The non-local transfers of control can be converted to arrays of instructions with jumps allowing for native or byte-code compilation.
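A sketch of what that conversion might look like for a small loop follows. The instruction set (`set`, `add`, `jump`, `jumpIfEqual`, `return`) and the `run` function are invented for this example and do not correspond to any real engine's bytecode; the program is compiled by hand.

```javascript
// Hypothetical instruction set for a machine with a single register.
// Source being modelled:
//   i = 0; while (true) { if (i === 3) break; i = i + 1; } return i;
const code = [
  { op: "set", value: 0 },                // 0: i = 0
  { op: "jumpIfEqual", value: 3, to: 4 }, // 1: "break" becomes a jump past the loop
  { op: "add", value: 1 },                // 2: i = i + 1
  { op: "jump", to: 1 },                  // 3: back edge of the loop
  { op: "return" },                       // 4: return i
];

function run(instructions) {
  let i = 0;  // the single register
  let pc = 0; // program counter: index into the instruction array
  while (pc < instructions.length) {
    const ins = instructions[pc];
    switch (ins.op) {
      case "set":
        i = ins.value;
        pc += 1;
        break;
      case "add":
        i += ins.value;
        pc += 1;
        break;
      case "jump":
        pc = ins.to;
        break;
      case "jumpIfEqual":
        pc = i === ins.value ? ins.to : pc + 1;
        break;
      case "return":
        return i;
      default:
        throw new Error("unknown op: " + ins.op);
    }
  }
  return undefined; // fell off the end without a return instruction
}

console.log(run(code)); // 3
```

Here the break and the loop's back edge are nothing more than assignments to the program counter, so no completion record is needed at run time.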

"EcmaScript Engine" might be a better way to express the same idea.

There are no static compilers for JavaScript apparently

This is not true. The V8 "interpreter" compiles to native code internally, Rhino optionally compiles to Java bytecode internally, and various Mozilla interpreters ({Trace,Spider,Jäger}Monkey) use a JIT compiler.

V8:

V8 increases performance by compiling JavaScript to native machine code before executing it, versus executing bytecode or interpreting it.

Rhino:

public final void setOptimizationLevel(int optimizationLevel)
Set the current optimization level. The optimization level is expected to be an integer between -1 and 9. Any negative values will be interpreted as -1, and any values greater than 9 will be interpreted as 9. An optimization level of -1 indicates that interpretive mode will always be used. Levels 0 through 9 indicate that class files may be generated. Higher optimization levels trade off compile time performance for runtime performance. The optimizer level can't be set greater than -1 if the optimizer package doesn't exist at run time.

TraceMonkey:

TraceMonkey adds native‐code compilation to Mozilla’s JavaScript® engine (known as “SpiderMonkey”). It is based on a technique developed at UC Irvine called “trace trees”, and building on code and ideas shared with the Tamarin Tracing project. The net result is a massive speed increase both in the browser chrome and Web‐page content.
