Electronic – How does an x86-64 Intel CPU know how many bytes to load into a register?

assembly cache cpu intel memory

I have the following assembly code on the left and its byte representation on the right:

mov    eax, 0x1    ; 0:  b8 01 00 00 00          
mov    ebx, 0x2    ; 5:  bb 02 00 00 00          
add    eax, ecx    ; a:  01 c8                   

I am not sure that I correctly understand how the CPU loads this byte code from the cache into registers.

Here is my understanding of the process. Let's assume that this byte code is already in the L1 cache.

As far as I understand:

  1. The CPU reads the single byte b8 and understands that this opcode means it must load the very next 32 bits into the EAX register [quick question: does the CPU load this opcode byte into the register or not?]
  2. Therefore, the CPU loads the next 4 bytes, 01 00 00 00, from the cache into EAX
  3. The CPU reads the single byte bb and understands that this opcode means it must load the next 32 bits into the EBX register
  4. Therefore, the CPU loads the very next 4 bytes, 02 00 00 00, from the cache into EBX
  5. The CPU reads the single byte 01 and understands that it needs to read one more byte to figure out which registers should be summed
  6. So it reads the very next byte, c8, and understands that it needs to sum the values in the EAX and ECX registers

If my idea is correct, it means that the CPU can read from the cache not only one 32-bit word per read, but also a single byte per read. Am I correct?

If not, please explain how the CPU executes this hex code.

Best Answer

Most real-world CPUs today fetch multiple bytes from the instruction cache at the same time (in many cases up to a full cache line, which is 64 bytes long). Multiple parallel decode engines then figure out the boundaries of the one or more instructions contained in that single cache read, and several instructions are issued together to the downstream scheduling and execution stages.
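To make the idea of "finding instruction boundaries in a fetched block" concrete, here is a minimal sketch in Python. It is only a software illustration, not how the hardware is built: `decode_one` and `find_boundaries` are hypothetical helper names, the decoder handles only the two opcode forms used in the question (`b8`..`bf` and `01` with a register-to-register ModRM byte), and it walks the block sequentially, whereas real decoders work on several candidate positions in parallel.

```python
# The 12 bytes from the question, treated as one block that has already been fetched.
CODE = bytes.fromhex("b801000000bb0200000001c8")

# 32-bit register names in x86 encoding order (register number = list index).
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_one(buf, pos):
    """Return (length, text) for the instruction starting at buf[pos].

    Toy decoder: supports only the opcodes that appear in the question.
    """
    opcode = buf[pos]
    if 0xB8 <= opcode <= 0xBF:
        # mov r32, imm32: the low 3 bits of the opcode select the register,
        # and the opcode itself implies that a 4-byte immediate follows.
        imm = int.from_bytes(buf[pos + 1:pos + 5], "little")
        return 5, f"mov {REG32[opcode & 7]}, {imm:#x}"
    if opcode == 0x01:
        # add r/m32, r32: one ModRM byte follows and names both operands.
        modrm = buf[pos + 1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        if mod != 0b11:
            raise NotImplementedError("memory operands are not handled in this sketch")
        return 2, f"add {REG32[rm]}, {REG32[reg]}"
    raise NotImplementedError(f"opcode {opcode:#04x} is not handled in this sketch")

def find_boundaries(buf):
    """Split a fetched block into (offset, length, text) tuples."""
    pos, found = 0, []
    while pos < len(buf):
        length, text = decode_one(buf, pos)
        found.append((pos, length, text))
        pos += length
    return found
```

Calling `find_boundaries(CODE)` yields `[(0, 5, 'mov eax, 0x1'), (5, 5, 'mov ebx, 0x2'), (10, 2, 'add eax, ecx')]`. Note that the opcode byte is never written into the destination register: it only tells the decoder how long the instruction is and what to do with the bytes that follow.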

In other words, cache reads, decode, issue, scheduling and execution happen for multiple instructions at any given time in most superscalar processors (which most mainstream processors are).
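As a rough illustration of why this overlap matters, the sketch below models an idealized in-order pipeline in Python. Everything in it is an assumption made for the example: the four stage names, the absence of stalls, and the helper names `pipeline_schedule` and `total_cycles`. Real Intel cores have many more stages and execute out of order, so treat the numbers as a toy model only.

```python
# Simplified stage list, assumed for illustration only.
STAGES = ["fetch", "decode", "issue", "execute"]

def pipeline_schedule(n_instructions, width):
    """Cycle in which each instruction occupies each stage.

    Assumes an idealized in-order pipeline with no stalls that moves
    `width` instructions per cycle through every stage.
    """
    schedule = []
    for i in range(n_instructions):
        group = i // width  # instructions in the same group advance together
        schedule.append({stage: group + s for s, stage in enumerate(STAGES)})
    return schedule

def total_cycles(n_instructions, width):
    """Cycles until the last instruction leaves the final stage."""
    groups = -(-n_instructions // width)  # ceiling division
    return groups + len(STAGES) - 1
```

Under these assumptions, the three instructions from the question need `total_cycles(3, 1) == 6` cycles on a machine that handles one instruction per cycle, but only `total_cycles(3, 4) == 4` cycles on a 4-wide machine, because all three are fetched and decoded in the same cycle.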