I believe it comes from the very early days of computing, when memory was very limited, and it was not wise to pre-allocate a large chunk of memory for exclusive use by the stack. So, by allocating heap memory from address zero upwards, and stack memory from the end of the memory downwards, you could have both the heap and the stack share the same area of memory.
If you needed a bit more heap, you could be careful with your stack usage; if you needed more stack, you could try to free some heap memory. The result, of course, was often a spectacular crash, as the stack would occasionally overwrite the heap and vice versa.
Back in those days there were no interwebz, so buffer-overrun exploits were not a concern. (Or at least, to the extent that the interwebz existed, it was all within high-security facilities of the United States Department of Defense, so the possibility of malicious data did not need to be given much thought.)
After that, with most architectures it was all a matter of maintaining compatibility with previous versions of the same architecture. That's why upside-down stacks are still with us today.
Depending on the language, a call stack may not be necessary at all. Call stacks are only needed in languages that allow recursion or mutual recursion. If the language does not allow recursion, then only one invocation of any procedure can be active at any moment, and the local variables for that procedure can be statically allocated. Such languages still have to make provision for context switches when handling interrupts, but even this does not require a stack.
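The static-allocation scheme above can be sketched in a few lines. This is a hedged illustration (the dict and names are my invention, standing in for compile-time fixed storage): each procedure's locals occupy one fixed slot rather than a per-call frame, which is exactly why recursion is impossible in such a language.

```python
# Sketch: FORTRAN-IV-style static locals. A dict stands in for the fixed
# storage a compiler would reserve at a known address for each procedure.
_static = {"average.total": 0.0}  # one slot, "allocated at compile time"

def average(values):
    # No stack frame is pushed: the local `total` is the SAME storage on
    # every call. A second simultaneous activation would clobber it --
    # which is precisely why the language must forbid recursion.
    _static["average.total"] = 0.0
    for v in values:
        _static["average.total"] += v
    return _static["average.total"] / len(values)

print(average([1.0, 2.0, 3.0]))  # 2.0
```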
Refer to FORTRAN IV (and earlier) and early versions of COBOL for examples of languages that do not require call stacks.
Refer to the Control Data 6600 (and earlier Control Data machines) for an example of a highly-successful early supercomputer that did not provide direct hardware support for call stacks. Refer to the PDP-8 for an example of a very successful early minicomputer that did not support call stacks.
As far as I know, the Burroughs B5000 stack machines were the first machines with hardware call stacks. The B5000 machines were designed from the ground up to run ALGOL, which required recursion. They also had one of the first descriptor-based architectures, which laid groundwork for capability architectures.
As far as I know, it was the PDP-6 (which grew into the DEC-10) that popularized call stack hardware, when the hacker community at MIT took delivery of one and discovered that the PUSHJ (Push Return Address and Jump) operation allowed the decimal print routine to be reduced from 50 instructions to 10.
The most basic function-call semantics of a language that allows recursion require capabilities that match nicely with a stack. If that's all you need, then a basic stack is a good, simple match. If you need more than that, then your data structure has to do more.
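To make that match concrete, here is a minimal sketch (my own example, not from the answer above) that evaluates a recursive function with an explicit Python list playing the role of the call stack: each "call" pushes the caller's pending local, and each "return" pops it back.

```python
# Sketch: factorial evaluated with an explicit call stack.
# Calls map to push, returns map to pop -- nothing more is needed.
def factorial(n):
    stack = []               # each entry is a saved "frame": the pending n
    while n > 1:
        stack.append(n)      # "call": save the caller's local and recurse
        n -= 1
    result = 1               # base case reached
    while stack:
        result *= stack.pop()  # "return": restore the caller's local
    return result

print(factorial(5))  # 120
```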
The best example of needing more that I have encountered is the "continuation": the ability to suspend a computation in the middle, save it as a frozen bubble of state, and fire it off again later, possibly many times. Continuations became popular in the Scheme dialect of LISP as a way to implement, among other things, error exits. Continuations require the ability to snapshot the current execution environment and reproduce it later, and a stack is somewhat inconvenient for that.
Abelson & Sussman's "Structure and Interpretation of Computer Programs" goes into some detail on continuations.
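Python has no call/cc, but the flavor of a reusable continuation can be sketched in continuation-passing style (all names below are my own illustration): the continuation is an ordinary closure, so it can be saved and fired more than once — exactly the property that a single mutable call stack makes awkward.

```python
# Sketch: continuation-passing style. The continuation `k` is a closure
# representing "the rest of the computation", so it can be snapshotted
# and invoked again later with a different value.
saved = []  # holds a captured continuation

def add1_cps(x, k):
    saved.append(k)      # freeze the rest of the computation
    return k(x + 1)      # then run it

# First run: the continuation doubles whatever it receives.
first = add1_cps(10, lambda v: v * 2)   # (10 + 1) * 2 = 22

# Fire the frozen continuation a second time, with a new value.
again = saved[0](100)                   # 100 * 2 = 200
print(first, again)
```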
Many stack-based languages use `roll` as well, which is a generalized `rot` on an arbitrary number of elements in the stack. They also implement the reverse operation for rotating the stack the other way. I would say that `roll` is more fundamental than `rot`. I have no answer, though, about the Turing-completeness question.
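As a sketch of the relationship between the two operations (helper names and list representation are my own; top of stack at the end of the list), `rot` is just `roll` applied to the third item:

```python
# Sketch: `rot` and a generalized `roll` over a Python list used as a
# stack, with the top of the stack at the end of the list.
def rot(stack):
    # rot: ( a b c -- b c a ) -- the third item from the top moves to the top.
    stack.append(stack.pop(-3))

def roll(stack, n):
    # roll: the item n places below the top (0-based) moves to the top,
    # so roll(stack, 2) is rot and roll(stack, 1) is swap.
    stack.append(stack.pop(-(n + 1)))

s = [1, 2, 3]   # 3 is on top
rot(s)          # -> [2, 3, 1]
roll(s, 1)      # swap the top two: -> [2, 1, 3]
```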