Turns out I missed a crucial detail in the accompanying text, and the registers are indeed composed of two (master-slave) sub-registers:
The Use of Master–Slave Registers
Note that the contents of the PC are
incremented within the same clock pulse. As a direct consequence, the
PC must be implemented as a master–slave flip–flop; one that responds
to its input only during the positive phase of the clock. In the
design of this computer, all registers in the CPU will be implemented
as master–slave flip–flops.
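To see why a register that feeds back on itself (PC = PC + 1) needs the master-slave arrangement, here is a minimal behavioural sketch. The class and method names are my own, and the phase assignment (master transparent while the clock is high, slave updating while it is low) is an assumption consistent with the quoted text rather than something taken from the book's schematics.

```python
class MasterSlaveRegister:
    """Behavioural model of a master-slave register (hypothetical names)."""

    def __init__(self, value=0):
        self.master = value  # internal stage, follows the input while clock is high
        self.slave = value   # output stage, the only value the rest of the CPU sees

    @property
    def output(self):
        return self.slave

    def clock_high(self, data_in):
        # Master is transparent and the slave holds, so the output cannot
        # change while the input is still being computed from it.
        self.master = data_in

    def clock_low(self):
        # Master holds and the slave copies it: the new value appears
        # at the output exactly once per clock cycle.
        self.slave = self.master


pc = MasterSlaveRegister(0)
history = []
for _ in range(3):
    # The incrementer output may be sampled many times while the clock is
    # high; because pc.output is frozen, the result is always old PC + 1.
    for _ in range(5):
        pc.clock_high(pc.output + 1)
    pc.clock_low()
    history.append(pc.output)
```

With a single transparent latch instead of two stages, the incremented value would race back to the input during the same phase and the PC could advance more than once per pulse.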
The external data-bus width doesn't always agree with the processor's internal structure. A well-known example is the old Intel 8088 processor, which was identical to the 16-bit 8086 internally, but had an 8-bit external bus.
Databus width is not a real indicator of the processor's power, though a narrower bus may limit data throughput. The actual power of a processor is determined by the CPU's ALU (Arithmetic and Logic Unit). 8-bit microcontrollers have 8-bit ALUs, which can process data in the range 0..255. That's enough for text processing: the ASCII character table only needs 7 bits. The ALU can do some basic arithmetic, but for larger numbers you'll need software help. If you want to add 100500 + 120760, an 8-bit ALU can't do that directly, and not even a 16-bit ALU can. So the compiler will split the numbers, do separate calculations on the parts, and recombine the results afterwards.
Suppose you have a decimal ALU which can process numbers of up to 3 decimal digits. The compiler will split 100500 into 100 and 500, and 120760 into 120 and 760. The CPU can calculate 500 + 760 = 260, plus an overflow (carry) of 1. It takes the carry and adds it to 100 + 120, so that the sum is 221. It then recombines the two parts to get the final result, 221260. This way you can do anything. The three-digit limit was no obstacle to processing 6-digit numbers, and you can write algorithms for processing numbers of 10 digits or more. Of course the calculation will take longer than with an ALU which can do 10-digit calculations natively, but it can be done.
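The 3-digit example can be written out as a short sketch. The function names are mine, and `base=1000` stands in for the imaginary 3-digit decimal ALU; a real compiler would do the same thing with 8-bit or 16-bit parts and the CPU's carry flag.

```python
def split_limbs(n, base=1000):
    """Split a non-negative integer into parts ("limbs"), least significant first."""
    limbs = []
    while True:
        n, limb = divmod(n, base)
        limbs.append(limb)
        if n == 0:
            return limbs


def add_limbs(a, b, base=1000):
    """Add two limb lists one limb at a time, passing the carry along."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        # Each step only needs an adder as wide as one limb, plus a carry.
        carry, digit = divmod(x + y + carry, base)
        result.append(digit)
    if carry:
        result.append(carry)
    return result


def join_limbs(limbs, base=1000):
    """Recombine limbs (least significant first) into one integer."""
    value = 0
    for limb in reversed(limbs):
        value = value * base + limb
    return value


a = split_limbs(100500)   # [500, 100]
b = split_limbs(120760)   # [760, 120]
total = join_limbs(add_limbs(a, b))   # 221260
```

The same loop handles numbers of any length, which is why the width of the ALU limits speed rather than what can be computed.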
Any computer can simulate any other computer.
The humble 8-bit processor can do exactly what a supercomputer can, given the necessary resources, and the time. Lots of time :-).
A concrete example is arbitrary-precision calculators. Most (software) calculators have something like 15 decimal digits of precision; if a number has more significant digits, they round it and possibly switch to mantissa + exponent form to store and process it. Arbitrary-precision calculators, however, extend the example calculation I gave earlier, and they let you multiply
\$ 44402958666307977706468954613 \times 595247981199845571008922762709 \$
for example, two numbers (they're both prime) which would need a wider databus than my PC's 64 bits. Extreme example: Mathematica gives you \$\pi\$ to 100000 digits in 1/10th of a second. Calculating \$e^{\pi \sqrt{163}}\$ \$^{(1)}\$ to 100000 digits takes about half a second. So, while you would expect working with data wider than the databus to be taxing, it's often not really a problem. For a PC running at 3 GHz this may not be surprising, but microcontrollers are getting faster as well: an ARM Cortex-M3 may run at speeds greater than 100 MHz, and for the same money you get a 32-bit bus too.
\$^{(1)}\$ About 262537412640768743.99999999999925007259, and it's not a coincidence that it's nearly an integer!
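The multiplication above can be sketched the same way, using 32-bit parts so that every intermediate value fits in 64 bits. This is plain schoolbook multiplication with names of my own choosing, not what Mathematica or any particular library does internally (those use faster algorithms for large operands). Python's built-in integers are already arbitrary precision, so they serve as the check here.

```python
MASK32 = 0xFFFFFFFF


def to_limbs(n):
    """Split a non-negative integer into 32-bit limbs, least significant first."""
    limbs = []
    while True:
        limbs.append(n & MASK32)
        n >>= 32
        if n == 0:
            return limbs


def mul_limbs(a, b):
    """Schoolbook multiplication of two limb lists."""
    result = [0] * (len(a) + len(b))
    for i, x in enumerate(a):
        carry = 0
        for j, y in enumerate(b):
            # Largest possible value is (2**32 - 1)**2 + 2 * (2**32 - 1),
            # which equals 2**64 - 1, so a 64-bit accumulator is enough.
            t = result[i + j] + x * y + carry
            result[i + j] = t & MASK32
            carry = t >> 32
        result[i + len(b)] = carry
    return result


def from_limbs(limbs):
    """Recombine 32-bit limbs (least significant first) into one integer."""
    value = 0
    for limb in reversed(limbs):
        value = (value << 32) | limb
    return value


a = 44402958666307977706468954613
b = 595247981199845571008922762709
product = from_limbs(mul_limbs(to_limbs(a), to_limbs(b)))
matches_builtin = (product == a * b)
```

Here `a` needs three 32-bit limbs and `b` needs four, so neither fits in a single 64-bit register, yet the work is only a dozen narrow multiply-and-add steps.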
Best Answer
See diagrams below to make sense of data flow etc.
– MPC: Address of next microinstruction to be fetched from memory.
– MIR: Current microinstruction whose bits drive control signals of data path
The question seems to be fundamentally wrong in a statement it makes BUT this may be a language issue - see below.
MIR is NOT loaded FROM MPC (as you say).
MPC is a pointer to the control store and MIR is loaded from the location that MPC points to.
I cannot be 100% sure that I am making sense of your question but if I am then what you suggest is incorrect. You ask -
If I follow what you are asking then the opposite of what you ask is what happens.
MPC address is latched in by the rising system clock.
MPC output stabilises during clock high.
MPC now addresses the control store, so that the control store output stabilises by the end of clock high, less any setup time that MIR may require.
Falling system clock latches control store data into MIR.
Cycle proceeds - see below.
SO to the question
I would answer: No! The MIR register is loaded from the control store (not from MPC) on the falling clock edge, AFTER the store output has gone stable, which occurs AFTER MPC goes stable, which occurs somewhere during clock high.
Following the timing through step by step should make this clear.
Say MIR is loaded by time t1.
(1) Once MIR is loaded the control signals from it propagate asynchronously out onto the data path.
The ALU function and data inputs are arranged to be stable before the ALU output needs to be used. This requires signals from MIR to select both ALU inputs as well as the ALU function.
(2) Say ALU is stably addressed and data fed and ALU output ready for shifter by t1 + t2.
(3) ALU and shifter then do their thing with output by t1 + t2 + t3.
(4) ALU output is now stored stably back into registers by t1 + t2 + t3 + t4.
This provides the next microinstruction address for MPC, which makes the control store output the word for MIR, which provides a new set of microinstruction bits - the cycle repeats.
The above diagram is from page 12 (I think), from here.
To the above add the following diagram.
They have used w x y z where I used t1 t2 t3 t4 - you can clearly see the propagation from the clock edge that triggers the cycle.
The register outputs from the old cycle are loaded on the rising clock edge, and MPC is addressed during clock high as the address bits stabilise. MPC becomes valid somewhere in the clock high time. The control store is asynchronously addressed by the stabilising MPC, and the control store output data must be stable by clock fall time (less any setup time required by MIR) so that MIR is loaded from the control store on the falling clock edge. The cycle then follows through as above and as per the times shown by the colours for w x y z below.
The above diagram is slide 6 from here.
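The ordering of events described above (MPC on the rising edge, MIR from the control store on the falling edge) can be traced with a tiny behavioural sketch. Everything in it is an assumption for illustration: the 4-word control store, the label strings, and the simplification that the next address comes straight from a field of the current microinstruction. It models only the order of events, not real propagation delays.

```python
# Hypothetical control store: each word is (control_bits, next_address).
CONTROL_STORE = [
    ("fetch", 1),
    ("decode", 2),
    ("execute", 3),
    ("writeback", 0),
]


def run_cycles(n_cycles, start_address=0):
    """Return a trace of (cycle, clock_edge, MPC, MIR) tuples."""
    mpc = None                  # microprogram counter
    mir = None                  # microinstruction register
    next_address = start_address
    trace = []
    for cycle in range(n_cycles):
        # Rising edge: MPC latches the address produced by the previous cycle.
        mpc = next_address
        trace.append((cycle, "rise", mpc, mir))

        # Clock high: the control store output settles for the new MPC value.
        store_output = CONTROL_STORE[mpc]

        # Falling edge: MIR latches the control store output (not MPC itself).
        mir = store_output
        trace.append((cycle, "fall", mpc, mir))

        # Clock low: MIR drives the data path (t1..t4 in the text) and
        # supplies the address that MPC will latch on the next rising edge.
        _control_bits, next_address = mir
    return trace


trace = run_cycles(2)
```

In the trace, MIR still holds the previous microinstruction at each rising edge and only changes at the falling edge, which is the point of the answer.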
Useful references:
THE MICROARCHITECTURE LEVEL
EENG4320 COMPUTER ARCHITECTURE
U of T at Tyler Here
The Microarchitecture Level
Wolfgang Schreiner
Research Institute for Symbolic Computation (RISC)
Johannes Kepler University, Linz, Austria
here
Wolfgang's Page
The Microarchitecture Level - lies between the digital logic level and the ISA level; uses digital circuits to implement machine instructions; the instruction set can be implemented directly in hardware (RISC) or interpreted by microcode (CISC).
http://www.ics.uci.edu/~bic/courses/51%20ICS/Lectures/ch4-all.pdf
Christmas Tree's Machine
Mic-1 Datapath and Control