Virtual Machines – Why Don’t VMs Execute Assembly Directly?

Many VMs execute a language in binary form, known as 'bytecode', which is assembled from a human-readable 'assembly' language.

For example, the assembly instructions push 1, push 2, add are translated (I think) into a series of ones and zeroes, which is then executed by the VM.

Why? Why don't VMs, and the JVM as an example, execute the assembly instructions directly?

They don't have the limitation of physical computers that can only handle ones and zeroes. The JVM can very well take textual instructions such as push 1 push 2 and execute them as they are. Why the additional step of compilation?

Best Answer

Here are a couple of reasons to think about:

Using human-readable assembly language would waste space on disk and in memory. That has an impact on caching, and therefore on performance. In your example, the mnemonic 'push' alone takes up four bytes. Why not compress the program by using one-byte tokens for all instructions instead of the human-readable strings?
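To make the size difference concrete, here is a minimal sketch of an assembler for the three-instruction example from the question. The opcode values and the `assemble` function are made up for illustration; they are not the JVM's actual encoding, and operands are assumed to fit in a single byte.

```python
# Hypothetical one-byte opcodes, chosen only for this illustration.
# A real VM such as the JVM defines its own instruction encoding.
OPCODES = {"push": 0x01, "add": 0x02}


def assemble(source: str) -> bytes:
    """Translate whitespace-separated assembly text into compact bytecode."""
    tokens = source.split()
    out = bytearray()
    i = 0
    while i < len(tokens):
        op = tokens[i]
        out.append(OPCODES[op])  # each mnemonic becomes a single byte
        i += 1
        if op == "push":
            # The textual operand is converted to binary once, here.
            # Assumption: the operand fits in one byte (0..255).
            out.append(int(tokens[i]))
            i += 1
    return bytes(out)


source = "push 1 push 2 add"
bytecode = assemble(source)
print(len(source.encode("ascii")), "bytes of text")  # 17 bytes of text
print(len(bytecode), "bytes of bytecode")            # 5 bytes of bytecode
```

Even in this tiny program the text form is more than three times larger than the bytecode, and the gap grows with longer mnemonics and larger programs.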

It wastes cycles on the processor. Your VM probably has at least two instruction mnemonics that start with 'p'. In order for your VM to figure out whether an instruction is 'push' or 'pop', it has to compare at least two bytes. It's much more efficient if each instruction can be uniquely identified by looking at a single byte.

The argument to your instructions is a string representing a number. The string has to be converted to a binary format appropriate for the underlying CPU before it can be used in arithmetic. That conversion will take dozens of instructions all by itself. Why do that every time the program is run? It's much more efficient to do it in a one-time pass when the bytecode is created.
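The dispatch and operand points can be sketched with a toy stack-machine interpreter. The opcode values `PUSH` and `ADD` and the `run` function are hypothetical, picked only for this example; this is not how the JVM is implemented, just the general shape of a bytecode loop.

```python
# Hypothetical opcode values, chosen only for this illustration.
PUSH, ADD = 0x01, 0x02


def run(bytecode: bytes) -> list:
    """Execute bytecode on a toy stack machine and return the final stack."""
    stack = []
    pc = 0  # index of the next byte to execute
    while pc < len(bytecode):
        opcode = bytecode[pc]  # a single byte identifies the instruction
        pc += 1
        if opcode == PUSH:
            # The operand is already a binary integer; no string parsing
            # happens at run time.
            stack.append(bytecode[pc])
            pc += 1
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode {opcode:#04x}")
    return stack


# Bytecode for: push 1, push 2, add
print(run(bytes([PUSH, 1, PUSH, 2, ADD])))  # [3]
```

The loop never compares strings or calls a number parser: identifying an instruction is one integer comparison, and the operand is used as is. A text-executing VM would have to repeat the mnemonic matching and the string-to-integer conversion on every run.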