Just went over some slides and noticed that the L1 cache (at least on Intel CPUs) distinguishes between data and instruction caches. I'd like to know why this is.
Why are there separate L1 caches for data and instructions?
cpu, hardware
Related Solutions
The I/O schemes you are describing are in current use in computers.
why does the CPU actually have to stay there, practically doing nothing other than waiting for I/O?
This is the simplest possible I/O method: programmed I/O. Many embedded systems and low-end microprocessors have only a single input instruction and a single output instruction. The processor must execute an explicit sequence of instructions for every character read or written.
but it should be possible for the CPU to wait or to check regularly while performing lots of other tasks, going back to the I/O process only when it's ready
Many personal computers have other I/O schemes. Instead of waiting in a tight loop for the device to become ready (busy waiting), the CPU starts the I/O device, asking it to generate an interrupt when it's done (interrupt-driven I/O).
Although interrupt-driven I/O is a step forward compared to programmed I/O, it requires an interrupt for every character transmitted, and that's expensive...
For instance, there could be some kind of mini-CPU that would just wait for it and deliver the small piece of data to the real CPU as soon as it gets back to the process; the process would be repeated, and we wouldn't have to dedicate practically a whole CPU core to the data-copy process...
The solution to many problems lies in having someone else do the work! :-)
The DMA (Direct Memory Access) controller/chip performs the programmed I/O, but it's somebody else doing it!
With DMA, the CPU only has to initialize a few registers and is then free to do something else until the transfer is finished (and an interrupt is raised).
Even DMA isn't totally free: high-speed devices can use many bus cycles for memory and device references (cycle stealing), and the CPU has to wait (the DMA chip always has higher bus priority).
I/O wait is 12.1%. This server has 8 cores (per cat /proc/cpuinfo). This is very close to 1/8 = 0.125.
I think this is from: Understanding Disk I/O - when should you be worried?
Well, it isn't strange: the system (MySQL) must fetch all the rows before manipulating the data, and there are no other activities.
This isn't a computer architecture / OS issue; it's just how the example is set up.
At most it could be an RDBMS tuning problem or a SQL query problem (missing index, bad query plan, bad query...)
Best Answer
There are actually several reasons.
First, and probably foremost, the data stored in the instruction cache is generally somewhat different from what's stored in the data cache -- along with the instructions themselves, there are annotations for things like where the next instruction starts, to help out the decoders. Some processors (e.g., NetBurst, some SPARCs) use a "trace cache", which stores the result of decoding an instruction rather than storing the original instruction in its encoded form.
Second, it simplifies circuitry a bit -- the data cache has to deal with reads and writes, but the instruction cache only deals with reads. (This is part of why self-modifying code is so expensive -- instead of directly overwriting the data in the instruction cache, the write goes through the data cache to the L2 cache, and then the line in the instruction cache is invalidated and re-loaded from L2).
Third, it increases bandwidth: most modern processors can read data from the instruction cache and the data cache simultaneously. Most also have queues at the "entrance" to the cache, so they can actually do two reads and one write in any given cycle.
Fourth, it can save power. While you need to keep the memory cells themselves powered to maintain their contents, some processors can and do power down some of the associated circuitry (decoders and such) when it's not being used. With separate caches, they can power up this circuitry independently for instructions and data, increasing the chance of a circuit remaining unpowered during any given cycle (I'm not sure any x86 processors do this -- AFAIK, it's more of an ARM thing).