Electronic – What does banking mean when applied to registers

computer-architecture, register

This answer to a question on StackOverflow about what banking means in the context of ARM's banked registers indicates that there is some confusion about the meaning of banking when applied to registers.

What does banking mean with respect to registers?

Best Answer

The word banking is used in two different senses when applied to registers.

Banked Registers for Interrupt Handling

The sense with which the StackOverflow question is concerned is functionally similar to (memory) bank switching, as used by some 8-bit and 16-bit processors. The names of a collection of registers are mapped to a different collection of physical registers. ARMv7 provides one extra bank for 7 of its 16 GPRs and five more banks for the stack pointer register and link register (ARM uses the link register to save the PC to be used for returning from the interrupt). Itanium provides one extra bank for 16 of its 31 static GPRs. (MIPS provides entire sets of 31 GPRs, calling them "shadow register sets".)
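The name-to-physical-register mapping can be illustrated with a small model. This is only a sketch under simplifying assumptions: it models a single extra bank covering r8 to r14 (the seven names banked for ARM's fast interrupt mode), the mode names "user" and "fiq" are plain strings chosen for illustration, and the class and method names are hypothetical.

```python
# Simplified model of ARM-style register banking (not a full ARMv7 model).
# Names r8..r14 map to private physical registers while in "fiq" mode;
# every other name maps to the same physical register in both modes.

BANKED_IN_FIQ = set(range(8, 15))  # r8..r14: 7 of the 16 GPR names


class BankedRegisterFile:
    def __init__(self):
        self.shared = [0] * 16                          # registers seen in "user" mode
        self.fiq_bank = {r: 0 for r in BANKED_IN_FIQ}   # extra physical registers
        self.mode = "user"

    def _uses_fiq_bank(self, name):
        return self.mode == "fiq" and name in BANKED_IN_FIQ

    def read(self, name):
        if self._uses_fiq_bank(name):
            return self.fiq_bank[name]
        return self.shared[name]

    def write(self, name, value):
        if self._uses_fiq_bank(name):
            self.fiq_bank[name] = value
        else:
            self.shared[name] = value


regs = BankedRegisterFile()
regs.write(8, 111)      # application value in r8
regs.write(0, 5)        # application value in r0

regs.mode = "fiq"       # interrupt entry: the bank switches, nothing is saved
regs.write(8, 999)      # handler uses r8 as scratch
regs.write(0, 7)        # r0 is not banked, so this overwrites the application's r0
handler_r8 = regs.read(8)

regs.mode = "user"      # interrupt return: the original bank is visible again
app_r8 = regs.read(8)   # still the application's value
app_r0 = regs.read(0)   # changed by the handler
```

In this model the handler can use r8 to r14 freely, but it would still have to save and restore any unbanked register it touches, such as r0 here.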

Unlike memory bank switching, the primary purpose of this type of register banking is (typically) not to extend addressable storage but to make interrupt handling faster and simpler: the handler does not need to save register values, load the values it uses, and later restore the original register values.

(Using the application's stack to save register state opens the possibility of overflowing the memory allocated for this stack, generating an exception which must then handle state saving somehow. Worse, if the page of memory immediately past the limit of the stack is writeable by the escalated privilege of the interrupt handler but not by the application, then the application is effectively writing to a page to which it does not have write permission. Some ABIs avoided this issue by defining one or more registers as volatile across interrupts. This allows the interrupt handler to load a pointer for state saving without clobbering application state, but unlike banked registers such software-defined interrupt volatile registers cannot be trusted to be unchanged by application software.)

(Using such banks of registers as fixed windows has been proposed to extend the number of registers available, e.g., "Increasing the Number of Effective Registers in a Low-Power Processor Using a Windowed Register File", Rajiv A. Ravindran et al., 2003. One might also note a similarity to the register stacks used to avoid register save and restore overhead for function calls, as in Itanium and SPARC [which uses the term "register windows"], though these mechanisms typically shift the register names rather than swapping them out.)

In terms of hardware, banked registers can be implemented by renaming the registers in instruction decode. For ARM's relatively complex banking system this would probably be the preferred mechanism. For a simpler banking system like that used by Itanium, with a single extra bank containing a power-of-two number of registers, it may be practical to incorporate the renaming into the indexing of the register file itself. (Of course, this would not be compatible with certain forms of renaming used to support out-of-order execution.)
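As a rough illustration of folding the bank select into the register file index, the sketch below assumes a layout with 32 register names, of which names 16 to 31 have one extra bank, and a 48-entry physical file. The layout, the function name `physical_index`, and the bit assignment are assumptions made for the example, not a description of any actual Itanium implementation.

```python
def physical_index(name, bank):
    """Map a 5-bit register name and a 1-bit bank select to a physical index.

    Assumed layout: entries 0..15 are unbanked, entries 16..31 hold bank 0
    of names 16..31, and entries 32..47 hold bank 1 of names 16..31.
    """
    assert 0 <= name < 32 and bank in (0, 1)
    low = name & 0xF            # low four bits pass through unchanged
    banked = (name >> 4) & 1    # bit 4 set means the name is in 16..31
    bit4 = banked & (bank ^ 1)  # banked name, bank 0 selected
    bit5 = banked & bank        # banked name, bank 1 selected
    return low | (bit4 << 4) | (bit5 << 5)


unbanked_same = physical_index(3, 0) == physical_index(3, 1)
bank0_entry = physical_index(16, 0)
bank1_entry = physical_index(16, 1)
```

Because the bank size is a power of two, the mapping needs only two AND gates on the index bits and no adder or lookup table, which is what makes this approach plausible for a simple banking scheme.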

By recognizing that different banks are not accessed at the same time, a clever optimization using this mechanism can reduce the (wire-limited) area overhead of a highly ported register file by using "3D registers". (This technique was proposed in the context of SPARC's register windows — "A Three Dimensional Register File For Superscalar Processors", Tremblay et al., 1995 — and a variant was used by Intel for SoEMT — "The Multi-Threaded, Parity-Protected 128-Word Register Files on a Dual-Core Itanium-Family Processor", Fetzer et al., 2005.)

Banking to Increase the Number of Possible Accesses

The second sense in which the term banking is used for registers refers to the splitting of a set of registers into groups (banks) each of which can be accessed in parallel. Using four banks increases the maximum number of accesses supported by a factor of four, allowing each bank to support fewer access ports (reducing area and energy use) for a given effective access count. However, to the extent that accesses in a given cycle are not evenly distributed across banks, the maximum number of accesses will not be achieved. Even with a large number of banks relative to the desired access count, bank conflicts can, in the worst case, limit the actual access count to the number of ports provided by a single bank.
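The effect of bank conflicts can be shown with a toy calculation. The sketch below assumes registers are assigned to banks by low-order interleaving (register number modulo the number of banks) and that each bank grants at most `ports_per_bank` accesses per cycle; the function name and parameters are hypothetical.

```python
from collections import Counter


def accesses_granted(regs, num_banks, ports_per_bank):
    """Count how many requested register accesses can proceed in one cycle."""
    # Assumed mapping: low-order interleaving of register numbers across banks.
    per_bank = Counter(r % num_banks for r in regs)
    # Each bank serves at most ports_per_bank of the requests aimed at it.
    return sum(min(count, ports_per_bank) for count in per_bank.values())


# Four accesses that fall in four different banks all proceed.
spread = accesses_granted([0, 1, 2, 3], num_banks=4, ports_per_bank=1)

# Four accesses that all fall in bank 0 are limited to one bank's ports.
clash = accesses_granted([0, 4, 8, 12], num_banks=4, ports_per_bank=1)
```

Here `spread` is 4 and `clash` is 1, matching the best case and worst case described above.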

There have been many academic papers on banked register files (Google Scholar search), and several general techniques have been proposed to reduce the impact of bank conflicts. The most obvious technique is to buffer instructions (as is done for out-of-order execution) providing some statistical averaging of bank conflicts. It is also possible to read a register operand before the instruction is ready to execute (e.g., if another operand is not yet ready or a structural hazard delays execution). Allocation of registers to banks can exploit information about expected use to reduce the probability of conflicts. (Software can assist by preferentially using registers in the expected manner.) Using virtual physical register names, it is possible to delay allocation of physical register names (and thus banks) until the value is stored in the register; this facilitates avoiding conflicts on the writes and may facilitate clever bank allocation to avoid read conflicts.

This type of banking is sometimes called pseudo-multiporting since it provides the illusion of a larger number of access ports. This technique is commonly used for caches since the physical structure is often partitioned into separate memory arrays for other reasons.

(One alternative to such banking is replicating the register file. Using two copies of the register file allows each copy to require half as many read ports, though the same number of write ports is required because every write must update both copies. This technique was used in POWER2 and the Alpha 21264 and is commonly used in high performance processors.)
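A minimal sketch of replication follows, assuming that every write is broadcast to all copies and that each group of readers is tied to one copy. The class name, the `cluster` parameter, and the port counts in the usage lines are illustrative assumptions, not figures for POWER2 or the Alpha 21264.

```python
class ReplicatedRegisterFile:
    def __init__(self, num_regs, copies=2):
        self.copies = [[0] * num_regs for _ in range(copies)]

    def write(self, reg, value):
        # Every write updates every copy, so each copy keeps all write ports.
        for copy in self.copies:
            copy[reg] = value

    def read(self, reg, cluster):
        # Each cluster of readers uses only its own copy.
        return self.copies[cluster][reg]


def ports_per_copy(read_ports, write_ports, copies=2):
    """Return (read ports, write ports) needed on each copy."""
    reads = -(-read_ports // copies)   # ceiling division
    return reads, write_ports


rf = ReplicatedRegisterFile(num_regs=32)
rf.write(5, 42)
same_value = rf.read(5, cluster=0) == rf.read(5, cluster=1)

# Hypothetical design point: 8 reads and 4 writes per cycle in total.
per_copy = ports_per_copy(read_ports=8, write_ports=4)
```

With these assumed numbers each copy needs 4 read ports and 4 write ports instead of 8 and 4, so only the read ports shrink.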

Summary

It may be helpful to distinguish these two types of banking as temporal banking in which bank selection is spread across time (like ARM's banked registers for fast interrupts) and spatial banking in which bank access can be concurrent in time but is spatially distributed.

Temporal banking is typically exposed to software and is used to reduce the overhead (and complexity) of interrupts. (Conceptually, thread switching in a Switch-on-Event-MultiThreaded processor is very similar to interrupt handling and can use similar mechanisms to reduce overhead.)

Spatial banking is less frequently part of the ISA and is used to reduce the cost of providing a larger number of register accesses per cycle. (One ISA-visible case is Itanium, which required load and store floating-point register pairs to use one even and one odd register number. Given the use of register rotation, this would not otherwise be guaranteed, and the requirement allows a trivial two-bank design to provide the extra register file accesses.)