If we need different JVMs for different architectures, I can't figure out the logic behind introducing this concept. In other languages we need different compilers for different machines, but in Java we need different JVMs instead. So what is the logic behind introducing the JVM, this extra step?
Java Bytecode – The Use of Converting Source Code to Java Bytecode
Tags: bytecode, java, jvm
Related Solutions
You can write a compiler that implements the Java Language Specification or write a JVM that implements the Java Virtual Machine specification, but when you officially want to call it "Java", you have to prove it is compatible by passing the tests of the TCK (technology compatibility kit) and pay for a license from Oracle.
Oracle doesn't make it easy for other parties to do this, though. Apache has its own JVM implementation (Apache Harmony), but Sun, and later Oracle, has neither made the TCK available to Apache nor let it obtain a license, which has led to a lot of resentment between Apache and Oracle.
Long ago Microsoft had their own version of Java (which was indeed called "Java"). They tried to change it to make it Windows-specific, which Sun of course didn't like. There was a lawsuit; Microsoft lost, abandoned their own Java version, and created .NET, which is a completely different thing that just happens to work a lot like Java...
The lawsuit about Android isn't based on this at all; Google isn't saying that Android is Java. That lawsuit is about patents; Oracle has patents on a number of ideas and concepts in their own JVM implementation and is claiming that Google is using the same patented ideas in Android without getting a patent license from Oracle.
Now, I don't know much about Clojure and only a little about Scala, but I'll give it a shot.
First off, we need to differentiate between tail CALLS and tail RECURSION. Tail recursion is indeed rather easy to transform into a loop. With general tail calls it's much harder, and in the general case impossible: you need to know what is being called, but with polymorphism and/or first-class functions you rarely know that, so the compiler cannot know how to replace the call. Only at runtime do you know the target code, and only then could you jump there without allocating another stack frame. For instance, the following fragment has a tail call and needs no extra stack space when properly optimized (i.e., with TCO), yet the call cannot be eliminated when compiling for the JVM:
// assuming a functional interface: interface Callable<A, R> { R call(A a); }
static int forward(Callable<Integer, Integer> obj, int arg) {
    int arg1 = arg + 1;
    return obj.call(arg1); // tail call: the result is returned unchanged
}
While it's just a tad inefficient here, there are whole programming styles (such as Continuation Passing Style or CPS) which have tons of tail calls and rarely ever return. Doing that without full TCO means you can only run tiny bits of code before running out of stack space.
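To make the distinction concrete, here is a small sketch (all class and method names are mine, not from the question): a tail-recursive sum that is mechanically rewritable as a loop, next to a pair of mutually tail-calling methods that has no such local rewrite. Since the JVM allocates a new frame for every call, a deep enough chain of such tail calls throws StackOverflowError:

```java
public class TailDemo {
    // tail RECURSION: the recursive call's result is the whole result,
    // and the callee is the method itself, so a loop rewrite is easy
    static int sumRec(int n, int acc) {
        return n == 0 ? acc : sumRec(n - 1, acc + n);
    }

    // the same computation after the recursion-to-loop transform
    static int sumLoop(int n) {
        int acc = 0;
        while (n != 0) { acc += n; n -= 1; }
        return acc;
    }

    // general tail CALLS: each call is in tail position, but the target
    // alternates, so neither method can be turned into a self-loop
    static boolean isEven(int n) { return n == 0 ? true  : isOdd(n - 1); }
    static boolean isOdd(int n)  { return n == 0 ? false : isEven(n - 1); }

    public static void main(String[] args) {
        System.out.println(sumRec(100, 0) == sumLoop(100)); // prints true
        try {
            isEven(50_000_000); // deep tail-call chain, no TCO on the JVM
        } catch (StackOverflowError e) {
            System.out.println("stack overflow");
        }
    }
}
```

A VM with a tail-call instruction, or a compiler performing full TCO, would run the isEven/isOdd chain in constant stack space.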
What facility of the underlying virtual machine would allow the compiler to handle TCO more easily?
A tail call instruction, such as in the Lua 5.1 VM. Your example does not get much simpler. Mine becomes something like this:
push arg
push 1
add
load obj
tailcall Callable.call
// implicit return; stack frame was recycled
As a sidenote, I would not expect actual machines to be much smarter than the JVM.
You're right, they aren't. In fact, they are less smart and thus don't even know (much) about things like stack frames. That's precisely why one can pull tricks like re-using stack space and jumping to code without pushing a return address.
Best Answer
The logic is that JVM bytecode is a lot simpler than Java source code.
Compilers can be thought of, at a highly abstract level, as having three basic parts: parsing, semantic analysis, and code generation.
Parsing consists of reading the code and turning it into a tree representation inside the compiler's memory. Semantic analysis is the part where it analyzes this tree, figures out what it means, and simplifies all the high-level constructs down to lower-level ones. And code generation takes the simplified tree and writes it out into a flat output.
With a bytecode file, the parsing phase is greatly simplified, since the code is already in the same flat byte-stream format that the JIT uses, rather than a recursive (tree-structured) source language. Also, a lot of the heavy lifting of the semantic analysis has already been performed by the Java (or other language) compiler. So all the JVM has to do is stream-read the code, do minimal parsing and minimal semantic analysis, and then perform code generation.
This makes the task the JIT has to perform a lot simpler, and therefore a lot faster to execute, while still preserving the high-level metadata and semantic information that makes it possible to theoretically write single-source, cross-platform code.
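As a concrete illustration of that flatness, here is a trivial Java method together with the linear stack-machine instructions that `javap -c` shows for it (the class name is just for illustration; the disassembly is reproduced in the comments):

```java
public class BytecodeDemo {
    // javap -c prints for add(int, int):
    //   iload_0   // push the first argument onto the operand stack
    //   iload_1   // push the second argument
    //   iadd      // pop both, push their sum
    //   ireturn   // return the top of the stack
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3)); // prints 5
    }
}
```

The tree structure of the source expression `a + b` is gone; what remains is a flat instruction stream the JIT can read front to back.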