TL;DR
In Java, the reason for public static void main(String[] args) is that
- Gosling wanted
- the code written by someone experienced in C (not in Java)
- to be executed by someone used to running PostScript on NeWS
For C#, the reasoning is transitively similar so to speak. Language designers kept the program entry point syntax familiar for programmers coming from Java. As C# architect Anders Hejlsberg puts it,
...our approach with C# has simply been to offer an alternative... to Java programmers...
Long version
expanding above and backed up with boring references.
java Terminator Hasta la vista Baby!
VM Spec, 2.17.1 Virtual Machine Start-up
...The manner in which the initial class is specified to the Java virtual machine is beyond the scope of this specification, but it is typical, in host environments that use command lines, for the fully qualified name of the class to be specified as a command-line argument and for subsequent command-line arguments to be used as strings to be provided as the argument to the method main. For example, using Sun's Java 2 SDK for Solaris, the command line
java Terminator Hasta la vista Baby!
will start a Java virtual machine by invoking the method main of class Terminator (a class in an unnamed package) and passing it an array containing the four strings "Hasta", "la", "vista", and "Baby!"...
...see also: Appendix: I need your clothes, your boots and your motorcycle
- My interpretation:
execution is targeted for use like typical scripts in a command line interface.
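For illustration, here is a minimal sketch of what the Terminator class from the spec's example could look like. The spec only names the class and the command line; the body and the describe helper are my own assumptions.

```java
import java.util.Arrays;

class Terminator {
    // Hypothetical helper (not in the spec): formats the arguments
    // that the launcher handed to main.
    static String describe(String[] args) {
        return args.length + " args: " + Arrays.toString(args);
    }

    public static void main(String[] args) {
        // Started as: java Terminator Hasta la vista Baby!
        System.out.println(describe(args));
    }
}
```

Run with the command line from the spec, this prints 4 args: [Hasta, la, vista, Baby!], much like a shell script echoing its parameters.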
important sidestep
...that helps avoid a couple of false traces in our investigation.
VM Spec, 1.2 The Java Virtual Machine
The Java virtual machine knows nothing of the Java programming language...
I noticed the above while studying the preceding chapter, 1.1 History, which I thought could be helpful (but it turned out to be useless).
- My interpretation:
execution is governed by VM spec alone, which
explicitly declares that it has nothing to do with Java language
=> OK to ignore JLS and anything Java language related at all
Gosling: a compromise between C and scripting language...
Based on the above, I began searching the web for JVM history. It didn't help: too much garbage in the results.
Then, I recalled legends about Gosling and narrowed down my search to Gosling JVM history.
Eureka! How The JVM Spec Came To Be
In this keynote from the JVM Languages Summit 2008, James Gosling discusses... Java's creation,... a compromise between C and scripting language...
- My interpretation:
explicit declaration that at the moment of creation,
C and scripting were considered the most important influences.
We have already seen the nod to scripting in VM Spec 2.17.1;
command line arguments sufficiently explain String[] args,
but static and main aren't there yet, need to dig further...
Note while typing this - connecting C, scripting and VM Spec 1.2 with its nothing-of-Java - I feel like something familiar, something... object oriented is slowly passing away. Take my hand and keep movin' Don't slow down we're nearly there now
Keynote slides are available online: 20_Gosling_keynote.pdf, quite convenient for copying key points.
page 3
The Prehistory of Java
* What shaped my thinking
page 9
NeWS
* Networked Extensible Window System
* A window system based on scripting....
PostScript (!!)
page 16
A Big (but quiet) Goal:
How close could I get to a
"scripting" feel...
page 19
The original concept
* Was all about building
networks of things,
orchestrated by a scripting
language
* (Unix shells, AppleScript, ...)
page 20
A Wolf in Sheeps Clothing
* C syntax to make developers
comfortable
A-ha! Let's look closer at C syntax.
The "hello, world" example...
main()
{
printf("hello, world\n");
}
...a function named main is being defined. The main function serves a special purpose in C programs; the run-time environment calls the main function to begin program execution.
...The main function actually has two arguments, int argc and char *argv[], respectively, which can be used to handle command line arguments...
Are we getting closer? You bet. It is also worth following the "main" link from the above quote:
the main function is where a program starts execution. It is responsible for the high-level organization of the program's functionality, and typically has access to the command arguments given to the program when it was executed.
- My interpretation:
To be comfortable for a C developer, the program entry point has to be main.
Also, since Java requires every method to be in a class, Class.main is
as close as it gets: static invocation, just class name and dot,
no constructors please - C knows nothing like that.
This also transitively applies to C#, taking into account
the idea of easy migration to it from Java.
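A rough sketch of why static fits here: a launcher that knows only a class can call main without constructing anything. Launcher, Hello and launch are hypothetical names of mine, and the real java launcher is native code that does not work exactly like this; the reflection calls merely mimic the idea.

```java
import java.lang.reflect.Method;

class Launcher {
    // Hypothetical stand-in for the initial class named on the command line.
    public static class Hello {
        static String[] seen; // records what main received, for inspection

        public static void main(String[] args) {
            seen = args;
            System.out.println("hello, world");
        }
    }

    // Looks up the public static main(String[]) and calls it with no instance.
    static void launch(Class<?> initial, String... args) {
        try {
            Method main = initial.getMethod("main", String[].class);
            main.invoke(null, (Object) args); // null receiver: the method is static
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot start " + initial.getName(), e);
        }
    }

    public static void main(String[] args) {
        launch(Hello.class, args);
    }
}
```

No constructor of Hello ever runs: class name, method name and an array of strings are all the launcher needs, which is about as close to calling a C main as Java gets.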
Readers thinking that a familiar program entry point doesn't matter are kindly invited to search Stack Overflow for questions where developers coming from Java SE are trying to write Hello World for Java ME MIDP. Note that the MIDP entry point has neither main nor static.
Conclusion
Based on the above, I would say that static, main and String[] args were, at the moments of Java and C# creation, the most reasonable choices to define the program entry point.
Appendix: I need your clothes, your boots and your motorcycle
Have to admit, reading VM Spec 2.17.1 was enormous fun.
...the command line
java Terminator Hasta la vista Baby!
will start a Java virtual machine by invoking the method main of class Terminator (a class in an unnamed package) and passing it an array containing the four strings "Hasta", "la", "vista", and "Baby!".
We now outline the steps the virtual machine may take to execute Terminator, as an example of the loading, linking, and initialization processes that are described further in later sections.
The initial attempt... discovers that the class Terminator is not loaded...
After Terminator is loaded, it must be initialized before main can be invoked, and a type (class or interface) must always be linked before it is initialized. Linking (§2.17.3) involves verification, preparation, and (optionally) resolution...
Verification (§2.17.3) checks that the loaded representation of Terminator is well formed...
Resolution (§2.17.3) is the process of checking symbolic references from class Terminator...
Symbolic references from Terminator oh yeah.
Consider this: let's say we got rid of all loops in Java (the compiler writers are on strike or something). Now we want to write factorial, so we might write something like this
int factorial(int i) { return factorial(i, 1); }
int factorial(int i, int accum) {
    if (i == 0) return accum;
    return factorial(i - 1, accum * i);
}
Now we're feeling pretty clever: we've managed to write our factorial even without loops! But when we test it, we notice that with any reasonably sized number we get a StackOverflowError, since there's no TCO (tail call optimization).
In real Java this isn't a problem. If we ever have a tail-recursive algorithm, we can transform it into a loop and be just fine. However, what about languages with no loops? Then you're just hosed. That's why Clojure has its recur form: without it, the language isn't even Turing complete (no way to do infinite loops).
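The loop transformation mentioned above can be sketched as follows. The class and method names are mine, and I use long instead of int only to get a bit more range; both are assumptions for illustration.

```java
class Factorial {
    // Tail-recursive form from the text: on the JVM each call still
    // pushes a new stack frame.
    static long recursive(long i, long accum) {
        if (i == 0) return accum;
        return recursive(i - 1, accum * i);
    }

    // Same algorithm with the self tail call replaced by updating the
    // parameters and jumping back to the top of a loop.
    static long iterative(long i) {
        long accum = 1;
        while (i != 0) {
            accum *= i;
            i--;
        }
        return accum;
    }

    public static void main(String[] args) {
        System.out.println(iterative(20));
    }
}
```

The iterative version uses one frame no matter how large i is; the result overflows long arithmetic beyond 20!, but the stack never grows.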
The functional languages that target the JVM, such as Frege, Kawa (Scheme) and Clojure, are always trying to deal with the lack of tail calls, because in these languages TC is the idiomatic way of doing loops! If translated to Scheme, that factorial above would be a good factorial. It'd be awfully inconvenient if looping 5000 times made your program crash. This can be worked around, though, with recur special forms, annotations hinting at optimizing self calls, trampolining, whatever. But they all force either performance hits or unnecessary work on the programmer.
Now Java doesn't get off free either, since there's more to TCO than just recursion: what about mutually recursive functions? They can't be straightforwardly translated to loops, but are still unoptimized by the JVM. This makes it spectacularly unpleasant to write algorithms using mutual recursion in Java, since if you want decent performance/range you have to do dark magic to get it to fit into loops.
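One flavor of that dark magic is a trampoline. Below is a minimal sketch, assuming the classic isEven/isOdd pair as the mutually recursive example; Trampoline, Bounce and run are hypothetical names of mine, not a library API.

```java
import java.util.function.Supplier;

class Trampoline {
    // Either a finished value (next == null) or a deferred call to make later.
    static final class Bounce<T> {
        final T value;
        final Supplier<Bounce<T>> next;

        private Bounce(T value, Supplier<Bounce<T>> next) {
            this.value = value;
            this.next = next;
        }

        static <T> Bounce<T> done(T value) {
            return new Bounce<>(value, null);
        }

        static <T> Bounce<T> call(Supplier<Bounce<T>> next) {
            return new Bounce<>(null, next);
        }
    }

    // Driver loop: performs each deferred call from this one frame,
    // so the stack depth stays constant.
    static <T> T run(Bounce<T> bounce) {
        while (bounce.next != null) {
            bounce = bounce.next.get();
        }
        return bounce.value;
    }

    // Mutually recursive pair: instead of calling each other directly,
    // each returns a description of the next call.
    static Bounce<Boolean> isEven(int n) {
        if (n == 0) return Bounce.done(true);
        return Bounce.call(() -> isOdd(n - 1));
    }

    static Bounce<Boolean> isOdd(int n) {
        if (n == 0) return Bounce.done(false);
        return Bounce.call(() -> isEven(n - 1));
    }

    public static void main(String[] args) {
        System.out.println(run(isEven(1_000_000))); // prints true
    }
}
```

It works for a million steps where direct calls would typically blow the stack, but every step allocates a Bounce and a lambda, which is exactly the kind of performance hit and extra work complained about above.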
So, in summary, this isn't a huge deal for many cases. Most tail calls either only proceed one stackframe deep, with things like
return foo(bar, baz); // foo is just a simple method
or are recursion. However, for the class of TC that don't fit into this, every JVM language feels the pain.
However, there is a decent reason why we don't yet have TCO. The JVM gives us stack traces. With TCO we systematically eliminate stack frames that we know are "doomed", but the JVM might actually want these later for a stack trace! Say we implement an FSM where each state tail-calls the next. With TCO we'd erase all record of previous states, so a traceback would show us what state we are in, but nothing about how we got there.
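To make the trade-off concrete, here is a small sketch of such an FSM. The state names are invented for illustration; the final state inspects the stack the JVM kept for it.

```java
import java.util.ArrayList;
import java.util.List;

class Fsm {
    // Each state ends with a tail call to the next state.
    static List<String> start() { return heating(); }
    static List<String> heating() { return cooling(); }
    static List<String> cooling() { return fail(); }

    // Final state: collects the names of the Fsm methods still on the stack.
    static List<String> fail() {
        List<String> path = new ArrayList<>();
        for (StackTraceElement frame : new Throwable().getStackTrace()) {
            if (frame.getClassName().equals(Fsm.class.getName())) {
                path.add(frame.getMethodName());
            }
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(start());
    }
}
```

Without TCO the trace still lists cooling, heating and start beneath fail, so we can see how the machine got there. With frames eliminated on every tail call, only fail (plus whatever called start) would remain.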
Additionally, and more pressingly, much of bytecode verification is stack based, and eliminating the thing that lets us verify bytecode is not a pleasant prospect. Between this and the fact that Java has loops, TCO looks like a bit more trouble than it's worth to the JVM engineers.
Best Answer
As explained by Brian Goetz (Java Language Architect at Oracle) in this video:
Anything that changed the number of frames on the stack would break this and would cause an error. He admits this was a stupid reason, and so the JDK developers have since replaced this mechanism.
He further then mentions that it's not a priority, but that tail recursion
N.B. This applies to HotSpot and the OpenJDK, other VMs may vary.