You can do a little better than this; because burst transfers only use the data paths while in progress, you can overlap bursts with address signal transfers.
So you can get ready to transfer the next burst while the current burst is in progress; likewise you can open the next bank and set RAS for it before the current bank's transfer is done. Start the next actual transfer, then come back to precharge this bank.
It's more complex, and you'd have to read the Spartan-6 MCB docs in case they don't allow this stuff; I was rolling my own controller when I did this.
In any case, it sounds like you won't need it, but it's nice to know it's there.
A bigger problem is that the controller will want to stop every 8 us and spend a chunk of time generating a refresh pulse (and the precharges around it). In my own core I could tell it to hold off until a convenient break (but for no longer than 70 us), and I later added a similar hack to the Virtex-5 MIG core for this purpose, but I don't think you can control refreshes on Spartan-6. So, if this is a problem, you'll need an elastic buffer somewhere to take up the slack.
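As a rough way to size that elastic buffer, assume a stream that cannot be paused and runs at a constant rate of R words per second, and let t_stall be the longest time the memory is unavailable (refresh plus the precharge and re-activate around it). Both symbols and the numbers below are illustrative assumptions, not datasheet values:

$$
N_{\text{buf}} \;\ge\; \left\lceil R \cdot t_{\text{stall}} \right\rceil
$$

For example, with an assumed R = 100 Mwords/s and an assumed t_stall = 200 ns:

$$
N_{\text{buf}} \;\ge\; \left\lceil 100 \times 10^{6}\,\tfrac{\text{words}}{\text{s}} \times 200 \times 10^{-9}\,\text{s} \right\rceil = 20 \text{ words}
$$

Take the real t_stall from your memory's datasheet and the controller's behaviour, and add margin on top of the minimum.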
I wouldn't call it a bug; it is more of a limitation, and in a way it makes total sense. You want it to infer a dual-port RAM, and the compiler wants to infer a dual-port RAM. However, the process in the problem snippet does not properly describe the address input of the RAM's read port, because not all paths are covered. The compiler would therefore have to infer a latch, while what you really want is a don't-care. So you are basically making it hard for the compiler.
Realize that to infer the RAM, it must also infer a few signals and their values, one of which is the address input. In the synchronous process, the value of this inferred signal is not defined for the case that ends up in PIXOUT <= x"111". So it would have to infer a latch, and the warning would be an awkward "inferring latch on inferred address signal of inferred RAM". That ends up being a bit too much, so it probably gives up, but then the alternative solution does not fit the device. I'm not saying that this is the exact reason why it is giving up, but it should be clear that the compiler would have difficulty filling in the blanks for the inferred RAM, given the way this process was coded.
All the solutions that work cover all the cases for the address signal, inferred or not. As a test, you could declare the signal explicitly and code it the same way, so that the compiler has to infer a latch; it may even compile, because now it would at least have an explicitly declared signal to refer to.
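To illustrate what "covering all the cases" can look like, here is a minimal sketch. The original snippet is not reproduced here, so the entity name, port names, the `active` flag and the 10-bit address width are all assumptions; only the 12-bit PIXOUT and the x"111" fill value come from the discussion above. The idea is to read the RAM unconditionally on every clock edge and to choose between the RAM output and the constant outside the memory process:

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical wrapper; names and widths are illustrative assumptions.
entity line_buffer is
  generic (
    ADDR_WIDTH : natural := 10
  );
  port (
    clk    : in  std_logic;
    we     : in  std_logic;
    waddr  : in  unsigned(ADDR_WIDTH - 1 downto 0);
    wdata  : in  std_logic_vector(11 downto 0);
    raddr  : in  unsigned(ADDR_WIDTH - 1 downto 0);
    active : in  std_logic;  -- '1' when a stored pixel should be shown
    pixout : out std_logic_vector(11 downto 0)
  );
end entity line_buffer;

architecture rtl of line_buffer is
  type ram_t is array (0 to 2**ADDR_WIDTH - 1)
    of std_logic_vector(11 downto 0);
  signal ram      : ram_t;
  signal ram_q    : std_logic_vector(11 downto 0);
  signal active_q : std_logic := '0';
begin
  -- Memory process: the write is conditional, the read is not.
  -- The read address is therefore defined on every path.
  process (clk)
  begin
    if rising_edge(clk) then
      if we = '1' then
        ram(to_integer(waddr)) <= wdata;
      end if;
      ram_q    <= ram(to_integer(raddr));
      active_q <= active;  -- delay the flag to match the registered read
    end if;
  end process;

  -- The fill value is selected outside the RAM description.
  pixout <= ram_q when active_q = '1' else x"111";
end architecture rtl;
```

Because the read port no longer depends on the branch that produces x"111", the tool does not have to guess an address for that case. Whether a given tool version maps this to a block RAM should still be checked in the synthesis report.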
The coding styles for inferring ram blocks for Altera can be found in http://www.altera.com/literature/hb/qts/qts_qii51007.pdf#page13
This is one of the reasons why it is strongly suggested that, if you wish to use existing hardware blocks, you either instantiate them directly to avoid ambiguities, or follow the appropriate coding styles so that they can be easily and adequately inferred.
Also note that a simulator would not have to deal with this, because it would not have to infer a ram block per se and deal with the ambiguous don't care vs latch of this phantom signal, because in simulation it does not exist.
Best Answer
If you have a working DRAM controller, there are several possibilities. Instantiate a Nios II with the debug option enabled, and you will even be able to debug using Eclipse CDT. If I remember correctly, there is also an Avalon-MM master that lets you peek into memory from System Console.
If, however, your design is the controller itself and you want to debug it, there is no way to access the memory by other means that bypass your design.
BTW: Using Qsys does not force you to start over; you can embed a Verilog component in a Qsys project and vice versa. I prefer to have a Qsys component as the top-level entity.