To answer such a question a lot more context must be given and assumptions must be made explicit. Just a few issues:
1) The call method you describe here is typical for ARM/Cortex and some lesser-known architectures. An 8085 uses the more common stack-based method.
2) Most architectures have dedicated hardware and data paths for incrementing the PC, so the ALU does not need to be involved, and it can be done in parallel with another operation.
3) An 8085 is an 8-bit architecture with a 16-bit address, hence getting an address from memory involves two memory accesses (with accompanying PC increments).
4) You seem to assume that a memory access takes 2 internal cycles worth of time. On an 8085 an ordinary memory read or write takes 3 clock periods (T-states) and an opcode fetch takes at least 4, and on modern processors a memory access often takes many more.
5) In step 3) you mention an accumulator; you probably mean the ALU result register, which on most register-based architectures is not a programmer-visible register.
6) If storing the result in Rn takes a cycle, it seems reasonable to assume that storing the destination address in the PC also takes a cycle.
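To make points 1) to 3) concrete, here is a rough sketch of a stack-based CALL on an 8-bit machine with 16-bit addresses. It is not a cycle-accurate 8085 model: the class `TinyCPU` and its method names are made up for illustration, and it only counts memory accesses, not clock periods. The opcode value 0xCD and the push order (high byte first) follow the 8085.

```python
# Sketch only: TinyCPU and its methods are hypothetical names, not a real emulator API.
class TinyCPU:
    def __init__(self, memory, pc, sp):
        self.mem = memory      # 64 KiB of byte-wide memory
        self.pc = pc           # 16-bit program counter
        self.sp = sp           # 16-bit stack pointer
        self.accesses = 0      # number of memory accesses performed

    def read(self, addr):
        self.accesses += 1
        return self.mem[addr & 0xFFFF]

    def write(self, addr, value):
        self.accesses += 1
        self.mem[addr & 0xFFFF] = value & 0xFF

    def fetch(self):
        # Every fetched byte bumps the PC; in hardware a dedicated
        # incrementer does this, so the ALU is not involved.
        byte = self.read(self.pc)
        self.pc = (self.pc + 1) & 0xFFFF
        return byte

    def call(self):
        if self.fetch() != 0xCD:          # 0xCD is CALL on the 8080/8085
            raise ValueError("not a CALL opcode")
        low = self.fetch()                # 16-bit target needs two 8-bit reads
        high = self.fetch()
        # Push the return address (PC now points past the CALL), high byte first.
        self.sp = (self.sp - 1) & 0xFFFF
        self.write(self.sp, self.pc >> 8)
        self.sp = (self.sp - 1) & 0xFFFF
        self.write(self.sp, self.pc & 0xFF)
        self.pc = (high << 8) | low       # finally load the PC with the target


mem = bytearray(0x10000)
mem[0x0100:0x0103] = bytes([0xCD, 0x34, 0x12])   # CALL 0x1234, low byte first
cpu = TinyCPU(mem, pc=0x0100, sp=0xFFFF)
cpu.call()
print(hex(cpu.pc), hex(cpu.sp), cpu.accesses)
```

In this sketch a single CALL costs five memory accesses (one opcode fetch, two operand reads, two stack writes), which is why the assumed cost per access matters so much for the total.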
Your interpretation is correct.
In terms of the line:
"the total is 4 words x 4 bytes = 16 bytes which is 128 bits"
That is the total size of the memory. Each word in that memory is 32 bits, and you have 4 of them giving a total size of 128 bits.
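The same arithmetic spelled out, as a quick check. The variable names are just for illustration.

```python
WORDS = 4            # number of words in the memory
BYTES_PER_WORD = 4   # each word is 4 bytes wide
BITS_PER_BYTE = 8

bits_per_word = BYTES_PER_WORD * BITS_PER_BYTE   # width of one word
total_bytes = WORDS * BYTES_PER_WORD             # size of the whole memory
total_bits = total_bytes * BITS_PER_BYTE

print(bits_per_word, total_bytes, total_bits)
```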
Best Answer
'Word length' is a fairly loose concept whose meaning has varied somewhat over the course of computer design history, depending on what was important, or what was seen as a performance limitation at the time.
Unfortunately, in your question, you've used the terms 'memory system', 'computer architecture', and 'ALU'. Each can define word length, and each can be different, though they often will align.
The most common measure of word length is the width of the internal data bus and multiplexers, and hence of the registers and, most importantly, the ALU. The original 4004 and 8085 were thus 4-bit and 8-bit machines respectively. The addressable memory on these machines was much larger than that, through paging and the use of double-width registers. This continued into the 16-bit 8086, which could address a 20-bit memory space. Once properly addressable linear memory was seen as important, the move from 32-bit to 64-bit 'computers' (operating systems) was based on addressable memory space, even though by then internal registers were a mix of 32, 64, 80, 128 bits, and more.
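As a small illustration of how a machine can address more memory than its word length suggests, here is a sketch of two of those mechanisms. The function names are made up. The first joins two 8-bit registers into a 16-bit address, as with a register pair on the 8085. The second is the 8086's segment:offset scheme, where a 16-bit segment is shifted left by 4 bits and added to a 16-bit offset to give a 20-bit address.

```python
def pair_address(high, low):
    """Join two 8-bit register values into one 16-bit address."""
    return ((high & 0xFF) << 8) | (low & 0xFF)


def segmented_address(segment, offset):
    """8086-style physical address: segment * 16 + offset, kept to 20 bits."""
    return (((segment & 0xFFFF) << 4) + (offset & 0xFFFF)) & 0xFFFFF


print(hex(pair_address(0x12, 0x34)))
print(hex(segmented_address(0x1234, 0x0010)))
```

In both cases the ALU and data bus stay 8 or 16 bits wide; only the address is wider.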