Java is always pass-by-value. Unfortunately, when we deal with objects we are really dealing with object-handles called references which are passed-by-value as well. This terminology and semantics easily confuse many beginners.
It goes like this:
public static void main(String[] args) {
Dog aDog = new Dog("Max");
Dog oldDog = aDog;
// we pass the object to foo
foo(aDog);
// aDog variable is still pointing to the "Max" dog when foo(...) returns
aDog.getName().equals("Max"); // true
aDog.getName().equals("Fifi"); // false
aDog == oldDog; // true
}
public static void foo(Dog d) {
d.getName().equals("Max"); // true
// change d inside of foo() to point to a new Dog instance "Fifi"
d = new Dog("Fifi");
d.getName().equals("Fifi"); // true
}
In the example above aDog.getName()
will still return "Max"
. The value aDog
within main
is not changed in the function foo
with the Dog
"Fifi"
as the object reference is passed by value. If it were passed by reference, then the aDog.getName()
in main
would return "Fifi"
after the call to foo
.
Likewise:
public static void main(String[] args) {
Dog aDog = new Dog("Max");
Dog oldDog = aDog;
foo(aDog);
// when foo(...) returns, the name of the dog has been changed to "Fifi"
aDog.getName().equals("Fifi"); // true
// but it is still the same dog:
aDog == oldDog; // true
}
public static void foo(Dog d) {
d.getName().equals("Max"); // true
// this changes the name of d to be "Fifi"
d.setName("Fifi");
}
In the above example, Fifi
is the dog's name after call to foo(aDog)
because the object's name was set inside of foo(...)
. Any operations that foo
performs on d
are such that, for all practical purposes, they are performed on aDog
, but it is not possible to change the value of the variable aDog
itself.
For more information on pass by reference and pass by value, consult the following SO answer: https://stackoverflow.com/a/430958/6005228. This explains more thoroughly the semantics and history behind the two and also explains why Java and many other modern languages appear to do both in certain cases.
The stack is the memory set aside as scratch space for a thread of execution. When a function is called, a block is reserved on the top of the stack for local variables and some bookkeeping data. When that function returns, the block becomes unused and can be used the next time a function is called. The stack is always reserved in a LIFO (last in first out) order; the most recently reserved block is always the next block to be freed. This makes it really simple to keep track of the stack; freeing a block from the stack is nothing more than adjusting one pointer.
The heap is memory set aside for dynamic allocation. Unlike the stack, there's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or freed at any given time; there are many custom heap allocators available to tune heap performance for different usage patterns.
Each thread gets a stack, while there's typically only one heap for the application (although it isn't uncommon to have multiple heaps for different types of allocation).
To answer your questions directly:
To what extent are they controlled by the OS or language runtime?
The OS allocates the stack for each system-level thread when the thread is created. Typically the OS is called by the language runtime to allocate the heap for the application.
What is their scope?
The stack is attached to a thread, so when the thread exits the stack is reclaimed. The heap is typically allocated at application startup by the runtime, and is reclaimed when the application (technically process) exits.
What determines the size of each of them?
The size of the stack is set when a thread is created. The size of the heap is set on application startup, but can grow as space is needed (the allocator requests more memory from the operating system).
What makes one faster?
The stack is faster because the access pattern makes it trivial to allocate and deallocate memory from it (a pointer/integer is simply incremented or decremented), while the heap has much more complex bookkeeping involved in an allocation or deallocation. Also, each byte in the stack tends to be reused very frequently which means it tends to be mapped to the processor's cache, making it very fast. Another performance hit for the heap is that the heap, being mostly a global resource, typically has to be multi-threading safe, i.e. each allocation and deallocation needs to be - typically - synchronized with "all" other heap accesses in the program.
A clear demonstration:
Image source: vikashazrati.wordpress.com
Best Answer
First and foremost, the "pass by value vs. pass by reference" distinction as defined in the CS theory is now obsolete because the technique originally defined as "pass by reference" has since fallen out of favor and is seldom used now.1
Newer languages2 tend to use a different (but similar) pair of techniques to achieve the same effects (see below) which is the primary source of confusion.
A secondary source of confusion is the fact that in "pass by reference", "reference" has a narrower meaning than the general term "reference" (because the phrase predates it).
Now, the authentic definition is:
When a parameter is passed by reference, the caller and the callee use the same variable for the parameter. If the callee modifies the parameter variable, the effect is visible to the caller's variable.
When a parameter is passed by value, the caller and callee have two independent variables with the same value. If the callee modifies the parameter variable, the effect is not visible to the caller.
Things to note in this definition are:
"Variable" here means the caller's (local or global) variable itself -- i.e. if I pass a local variable by reference and assign to it, I'll change the caller's variable itself, not e.g. whatever it is pointing to if it's a pointer.
The meaning of "reference" in "pass by reference". The difference with the general "reference" term is that this "reference" is temporary and implicit. What the callee basically gets is a "variable" that is somehow "the same" as the original one. How specifically this effect is achieved is irrelevant (e.g. the language may also expose some implementation details -- addresses, pointers, dereferencing -- this is all irrelevant; if the net effect is this, it's pass-by-reference).
Now, in modern languages, variables tend to be of "reference types" (another concept invented later than "pass by reference" and inspired by it), i.e. the actual object data is stored separately somewhere (usually, on the heap), and only "references" to it are ever held in variables and passed as parameters.3
Passing such a reference falls under pass-by-value because a variable's value is technically the reference itself, not the referred object. However, the net effect on the program can be the same as either pass-by-value or pass-by-reference:
As you may see, this pair of techniques is almost the same as those in the definition, only with a level of indirection: just replace "variable" with "referenced object".
There's no agreed-upon name for them, which leads to contorted explanations like "call by value where the value is a reference". In 1975, Barbara Liskov suggested the term "call-by-object-sharing" (or sometimes just "call-by-sharing") though it never quite caught on. Moreover, neither of these phrases draws a parallel with the original pair. No wonder the old terms ended up being reused in the absence of anything better, leading to confusion.4
NOTE: For a long time, this answer used to say:
This is mostly correct except the narrower meaning of "reference" -- it being both temporary and implicit (it doesn't have to, but being explicit and/or persistent are additional features, not a part of the pass-by-reference semantic, as explained above). A closer analogy would be giving you a copy of a document vs inviting you to work on the original.
1Unless you are programming in Fortran or Visual Basic, it's not the default behavior, and in most languages in modern use, true call-by-reference is not even possible.
2A fair amount of older ones support it, too
3In several modern languages, all types are reference types. This approach was pioneered by the language CLU in 1975 and has since been adopted by many other languages, including Python and Ruby. And many more languages use a hybrid approach, where some types are "value types" and others are "reference types" -- among them are C#, Java, and JavaScript.
4There's nothing bad with recycling a fitting old term per se, but one has to somehow make it clear which meaning is used each time. Not doing that is exactly what keeps causing confusion.