Well, I'm really not sure.
This CPU description is incomplete: for example, there is neither a branch path nor any data memory access.
For the non-pipelined version, the only visible clocked element is the program counter. The next PC address can be calculated while the ALU operation is in progress, so the propagation time is 6 + 3.5 + 4 + 4 + 1 + 6 = 24.5 ns.
All the decoders (source, operand...) work in parallel, so only the longest decoder delay is part of the timing "critical path". There is no clear indication of the delay needed for writing into the register file; maybe 4 ns more.
For the pipelined version:
F : I-cache : 6 ns (in parallel with the program counter update)
ID1 : Instruction decoder : 3.5 ns
ID2 : Destination decoder : 4 ns (in parallel with the other decoders, which are faster) + register file : 4 ns
EX : ALU + MUX : 7 ns
WB : register update : ??? ns
The longest stage (ID2) takes around 8 ns, and that sets the clock period.
Alternatively, put all the decoding in ID1 (3.5 + 4 = 7.5 ns) and the register access in ID2 (4 ns). Traditionally, the ALU is part of the EXECUTE stage.
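To make the arithmetic above easy to check, here is a minimal Python sketch. The delays are the ones quoted in this answer (the 4 ns register-file figure is my own guess, as noted), the names `stage_delay` and `clock_period` are just illustrative, the WB stage is left out because its delay is unknown, and pipeline-register overhead is ignored.

```python
# Component delays in ns, as quoted in the answer above.
# The register-file value is a guess; the exercise does not give it.
DELAY = {
    "icache": 6.0,
    "instr_decoder": 3.5,
    "dest_decoder": 4.0,  # slowest of the parallel decoders
    "reg_file": 4.0,
    "mux": 1.0,
    "alu": 6.0,
}


def stage_delay(parts):
    """Delay of one stage: its components are traversed in series."""
    return sum(DELAY[p] for p in parts)


def clock_period(stages):
    """A pipeline's clock period is set by its slowest stage."""
    return max(stage_delay(parts) for parts in stages.values())


# Non-pipelined: every component sits on one long combinational path.
non_pipelined = stage_delay(DELAY)

# First split: destination decoder and register file share ID2.
split_a = {
    "F": ["icache"],
    "ID1": ["instr_decoder"],
    "ID2": ["dest_decoder", "reg_file"],
    "EX": ["mux", "alu"],
}

# Alternative split: all decoding in ID1, register access alone in ID2.
split_b = {
    "F": ["icache"],
    "ID1": ["instr_decoder", "dest_decoder"],
    "ID2": ["reg_file"],
    "EX": ["mux", "alu"],
}

print(non_pipelined, clock_period(split_a), clock_period(split_b))
```

With these numbers the alternative split shaves the slowest stage from 8 ns to 7.5 ns, at which point the 7 ns EX stage is nearly the limit anyway.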
Anyway, I think that this exercise is really poorly written.
Ultimately, I think the question is flawed, and I do NOT get close to any of the answers. It is hard to get 22k saved from 32K when the nanocode alone takes up 14.4K.
From 1.
In some cases, such as the Motorola 68000, there is also a nanocode engine. The 68000 uses 544 17-bit words in its microengine and 336 68-bit words in its nanocode engine. It thus has 32,096 bits of ROM. If everything had been done with 68-bit words, it would have required 36,992 bits.
Memory was expensive in the era of Complex Instruction Set Computers, so a single instruction such as [Inc Memory] was carried out by microcode as several steps (Read Memory, Inc Register, Store Memory). To reduce the size of the microcode, Motorola implemented nanocode, which the microcode calls.
544 words × 17-bit microcode words + 336 words × 68-bit nanocode words = 32,096 bits of ROM.
544 words × 68-bit microcode words = 36,992 bits of ROM.
36,992 - 32,096 = 4,896 bits saved.
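The three totals above can be reproduced with a few lines of Python. This is only a check of the arithmetic using the word counts and widths quoted from source 1; the variable names are mine.

```python
# Figures quoted above for the Motorola 68000 control store.
MICRO_WORDS, MICRO_BITS = 544, 17
NANO_WORDS, NANO_BITS = 336, 68

# Two-level store: narrow microwords plus a separate wide nanostore.
two_level = MICRO_WORDS * MICRO_BITS + NANO_WORDS * NANO_BITS

# Single-level store: every microword is as wide as a nanoword.
single_level = MICRO_WORDS * NANO_BITS

saved = single_level - two_level
print(two_level, single_level, saved)
```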
$$\log_2 336 \approx 8.39$$
To address the 336 nanocode words, 9 bits are required. The microword actually has 17. This makes sense, since similar microcode or nanocode can use don't-care states to select different operations.
From 2 slide 11 notes.
There are n= 2048 words that are each 41 bits wide, giving an area complexity of 2048 × 41 = 83,968 bits.
The unique microwords (100 for this case) form a nanoprogram, which is stored in a ROM that is only 100 words deep by 41 bits wide.
$$\log_2 100 \approx 6.64$$
A minimum of 7 bits is needed in each microword to address the nanocode.
The microprogram now indexes into the nanostore. The microprogram has the same number of microwords regardless of whether or not a nanostore is used, but when a nanostore is used, pointers into the nanostore are stored in the microstore rather than the wider 41-bit words. For this case, the microstore is now 2048 words deep by 7 bits wide. The area complexity using a nanostore is then 100 × 41 + 2048 × 7 = 18,436 bits, which is a considerable savings in area over the original microcoded approach.
18,436 bits vs. 83,968 bits. Significant savings.
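The same calculation can be written as a small Python sketch that works for any microstore. The function names are hypothetical, and I am assuming the pointer is the smallest whole number of bits that can index every unique word.

```python
def pointer_bits(unique_words: int) -> int:
    """Smallest number of bits that can index `unique_words` nanostore entries."""
    return max(1, (unique_words - 1).bit_length())


def area_without_nanostore(micro_words: int, word_bits: int) -> int:
    """Every microword stores the full-width control word."""
    return micro_words * word_bits


def area_with_nanostore(micro_words: int, word_bits: int, unique_words: int) -> int:
    """Nanostore holds the unique wide words; the microstore holds only pointers."""
    nanostore = unique_words * word_bits
    microstore = micro_words * pointer_bits(unique_words)
    return nanostore + microstore


# The example from source 2: 2048 microwords, 41 bits wide, 100 unique.
print(pointer_bits(100))
print(area_without_nanostore(2048, 41))
print(area_with_nanostore(2048, 41, 100))
```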
Same methodology: the microcode indexes into the nanocode. Bits are saved in the microcode, but the nanocode ROM must be added back to determine the true savings. True savings = 1024 × 32 - (1024 × 9 + 450 × 32) = 32,768 - 23,616 = 9,152 bits.
9.1k is significantly less than 22k, because the nanocode alone costs 450 words × 32-bit = 14,400 bits. It is hard to get 22k saved from 32K when nanocode takes up 14.4K. No answer is correct, hence my assertion that the question is flawed in some way.
As per Maryam Ghizhi's comments below:
1024 × 32-bit - 1024 × 9-bit = 23,552 bits or 23 Kbits (saved).
The savings are counted in the microcode only: (32 bits - 9 bits) × 1024 words = 23 bits × 1024 words = 23 Kbits.
This is one of the answers on the list. It is what is saved from the microcode (ignoring the nanocode). Final Answer: 2.
Within the parameters of the question, since 9 bits are required to address 450 nanocode locations, there is no way to get to 22Kbits of microcode saved.
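For the question's own numbers (1K microinstructions, 32 control signals, 450 distinct patterns), both readings can be checked side by side. This is a rough Python sketch; the variable names are mine, and the 9-bit pointer is assumed to be the minimum width that can index 450 entries.

```python
MICRO_WORDS = 1024    # 1K microinstructions
WORD_BITS = 32        # 32 control signals
UNIQUE_PATTERNS = 450

# Minimum pointer width needed to index 450 nanostore entries.
ptr_bits = (UNIQUE_PATTERNS - 1).bit_length()

# Reading 1: bits removed from the micro-programmed memory only.
micro_saved = MICRO_WORDS * (WORD_BITS - ptr_bits)

# Reading 2: net saving once the added nanostore is paid for.
nano_cost = UNIQUE_PATTERNS * WORD_BITS
net_saved = micro_saved - nano_cost

print(ptr_bits, micro_saved, micro_saved // 1024, nano_cost, net_saved)
```

Neither reading lands on 22 Kbits: the microstore alone shrinks by 23 Kbits, and the net figure is far lower.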
Edit...
From 3, where it appears you have asked this question before (along with a bounty):
in digital system with micro-programmed control circuit, total of distinct operation pattern of 32 signal is 450. if the micro-programmed memory contains 1K micro instruction, by using Nano memory, how many bits is reduced from micro-programmed memory?
The phrase "from micro-programmed memory" is a very important part of the question, and it makes the correct answer "2".
I'd focus on why the correct answer would not be 2, rather than on why your notes say the correct answer is 1. Like I said, flawed question.
Best Answer
The memory is byte addressable, and the 16-bit bus transfers two bytes at a time.
But the total capacity is still just 1 MB.
The diagram is a bit misleading in this regard — it should show only A19-A1 going to both banks. A "bank" refers to 8-bit wide memory, and one or both banks are accessed as described in the table. They're trying to show that the left bank (D15-D8) is enabled only when BHE- is low, and the right bank (D7-D0) is enabled only when A0 is low.
So yes, when stepping through memory (such as when prefetching instructions), the bus address increments by two.
However, note that 8086 instructions are anywhere from 1 to 6 bytes long, and that's what determines the actual IP increment. The 8086 reads instructions from a 6-byte prefetch buffer (4 bytes on the 8088), and the logic that fills that buffer uses its own address register that is distinct from the IP.
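The bank enables described above can be sketched in Python as follows. The function names are hypothetical, BHE- is modelled as an active-low value (0 means asserted), and the ordering of the two cycles for a misaligned word (odd byte first) is how I understand the 8086 to behave.

```python
def bank_enables(bhe_n: int, a0: int) -> tuple[bool, bool]:
    """Return (high_bank_enabled, low_bank_enabled).

    The high bank (D15-D8) is enabled when BHE- is low;
    the low bank (D7-D0) is enabled when A0 is low.
    """
    return (bhe_n == 0, a0 == 0)


def bus_cycles(addr: int, size: int) -> list[tuple[int, int, int]]:
    """List (A19-A1 value, BHE-, A0) for each bus cycle of a 1- or 2-byte access."""
    if size not in (1, 2):
        raise ValueError("size must be 1 or 2 bytes")
    a0 = addr & 1
    if size == 1:
        # Even byte uses the low bank only; odd byte uses the high bank only.
        return [(addr >> 1, 1 if a0 == 0 else 0, a0)]
    if a0 == 0:
        # Aligned word: both banks are enabled in a single cycle.
        return [(addr >> 1, 0, 0)]
    # Misaligned word: odd byte first, then the following even byte.
    return [(addr >> 1, 0, 1), ((addr + 1) >> 1, 1, 0)]


print(bus_cycles(0x1000, 2))  # aligned word
print(bus_cycles(0x1001, 1))  # odd byte
print(bus_cycles(0x1001, 2))  # misaligned word
```

Note that A0 never reaches the memory chips as an address line in this model; only the A19-A1 value selects the location, which is why both banks see the same address.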