Turns out I missed a crucial detail in the accompanying text, and the registers are indeed composed of two (master-slave) sub-registers:
The Use of Master–Slave Registers
Note that the contents of the PC are incremented within the same clock pulse. As a direct consequence, the PC must be implemented as a master-slave flip-flop: one that responds to its input only during the positive phase of the clock. In the design of this computer, all registers in the CPU will be implemented as master-slave flip-flops.
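The quoted two-phase behavior can be sketched in plain Python. This is a toy model, not a hardware description; the class and method names are mine, chosen to mirror the master/slave terminology:

```python
class MasterSlaveRegister:
    """Toy model of a master-slave register: the master latch samples
    the input on one clock phase, while the slave latch keeps exposing
    the old value, so the output stays stable during the update."""

    def __init__(self, value=0):
        self.master = value   # input-side latch
        self.slave = value    # output-side latch (the visible value)

    def load(self, value):
        # Positive phase: only the master latch changes.
        self.master = value

    def tick(self):
        # Phase boundary: the slave latch takes the master's value.
        self.slave = self.master

# This is why the PC can be incremented "within the same clock pulse":
# reading pc.slave (the old value) while loading pc.slave + 1 is
# race-free, because the output doesn't move until tick().
pc = MasterSlaveRegister(0x100)
pc.load(pc.slave + 1)   # old value 0x100 is still readable here
pc.tick()               # new value 0x101 becomes visible
```

A plain single-latch register couldn't do this: its output would change while still being read, creating a combinational feedback loop.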
The short answer, then, is "it depends." Different processor families use different approaches; there isn't a one-size-fits-all answer for micros. Also, synchronous interrupts (those generated by internal hardware synchronized to an internal clock) may be acknowledged more predictably, and differently, than asynchronous events (those external to the micro). As always, read the datasheet and keep an open mind.
Calculating precise timing depends on the processor as well. Even if an event occurs (timer counter match, external pin, whatever), if the processor is busy executing a multi-cycle instruction, it usually won't abort that instruction in order to start the interrupt routine. (But some processors WILL interrupt SOME instructions, even though I just said they usually don't. So even that isn't gospel, and you have to read the datasheet and family guides to be sure.) Worse, even if the processor is executing single-cycle instructions, there may be some variability in the interrupt response. You might find the docs saying "5 to 6 cycles later," for example. So even then, you aren't exactly sure.
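To make that "range of cycles" idea concrete, here is a back-of-the-envelope sketch. Every constant below is a made-up placeholder; the real numbers come from the part's datasheet, not from this code:

```python
# Hypothetical cycle counts for illustration only.
SYNC_CYCLES = (1, 2)       # uncertainty synchronizing an async input
LONGEST_INSTRUCTION = 18   # e.g. a multi-cycle divide in flight
INTERRUPT_ENTRY = 5        # stacking / vector fetch overhead

def latency_range(clock_hz):
    """Return (best, worst) interrupt latency in seconds.

    Best case: the event lands as a single-cycle instruction retires.
    Worst case: it lands just as the longest instruction begins."""
    best_cycles = SYNC_CYCLES[0] + 1 + INTERRUPT_ENTRY
    worst_cycles = SYNC_CYCLES[1] + LONGEST_INSTRUCTION + INTERRUPT_ENTRY
    period = 1.0 / clock_hz
    return best_cycles * period, worst_cycles * period

best, worst = latency_range(1_000_000)   # 1 MHz toy clock
```

The point of the exercise: even with fixed constants, the best and worst cases differ by the length of the longest uninterruptible instruction, which is exactly the variability the docs are hedging about.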
On the other hand, some processors are as predictable as an atomic clock. The Analog Devices ADSP-21xx processor (single cycle for EVERY instruction word -- some of which can perform three operations in parallel) ALWAYS has exactly the same interrupt response time, every time, to a timer event. You can almost set your atomic clock by it. No variation at all. Just clean, perfect, predictable responses. Every time.
But that's rare.
And if your processor has a nice, long pipeline, you might have to wait for it to "drain" out. That's done so there isn't a lot of internal state to save and restore; otherwise you'd be restarting multiple instructions in various stages of execution.
But even then, there are exceptions. The DEC Alpha might take some clocks just to get to your interrupt routine. But when it does, it will probably have several instructions in various stages of its various pipelines -- all of which are just sitting there waiting to continue. Your interrupt code will need to save all that state, do its thing, restore that state, and then restart ... with all the pipelines back where they were when interrupted. The interrupt code, if it needs to track back and find a faulting instruction, is PAINFUL to write. But that thing screamed. They wouldn't even do lane changes for byte selection, because that would have added a combinational delay and reduced the clock rate.
So, that's rare. But yes, even that can happen.
SO...... READ THE DATASHEET AND FAMILY MANUAL. And be prepared for ANYTHING at all. Designers can be VERY CREATIVE at times.
Best Answer
To answer such a question, a lot more context must be given and the assumptions made explicit. Just a few issues:
1) The call method you describe here is typical of ARM/Cortex and some lesser-known architectures. An 8085 uses the more common stack-based method.
2) Most architectures have dedicated hardware and data paths for incrementing the PC, so the ALU does not need to be involved, and it can be done in parallel with another operation.
3) An 8085 is an 8-bit architecture with a 16-bit address, hence getting an address from memory involves two memory accesses (with accompanying PC increments).
4) You seem to assume that a memory access takes 2 internal cycles' worth of time. IIRC it was 1 for an 8085 (but I might be wrong), and it is often many more for modern processors.
5) In step 3) you mention an accumulator; you probably mean the ALU result register, which on most register-based architectures is not a programmer-visible register.
6) If storing the result in Rn takes a cycle, it seems reasonable to assume that storing the destination address in the PC also takes a cycle.
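Putting points 2) and 3) together, here is a minimal sketch of how an 8-bit CPU with 16-bit addresses fetches a two-byte operand: low byte first, then high byte, with a PC increment after each access. The function name and memory layout are illustrative, not actual 8085 microcode:

```python
def fetch_word(memory, pc):
    """Fetch a 16-bit little-endian operand from an 8-bit-wide memory.

    Returns (target_address, new_pc). Each byte costs one memory
    access; the PC increments use dedicated increment hardware, so
    the ALU is never involved."""
    low = memory[pc]             # first memory access
    pc = (pc + 1) & 0xFFFF       # dedicated PC increment
    high = memory[pc]            # second memory access
    pc = (pc + 1) & 0xFFFF
    return (high << 8) | low, pc

# Operand bytes of e.g. a jump to 1234h, stored low byte first.
memory = {0x0100: 0x34, 0x0101: 0x12}
target, pc = fetch_word(memory, 0x0100)
assert target == 0x1234 and pc == 0x0102
```

Note that the two accesses (and their PC increments) are exactly why an absolute jump or call costs more cycles on an 8-bit machine than the same operation on a machine whose bus can fetch the whole address at once.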