DRAM, as you said, basically consists of a storage capacitor and a transistor to access the voltage stored on that capacitor. Ideally, the charge stored on that capacitor would never decrease, but there are leakage components that allow the charge to bleed off. If enough charge bleeds off the capacitor, then the data cannot be recovered. In normal operation, this loss of data is avoided by periodically refreshing the charge in the capacitor. This is why it is called Dynamic RAM.
Decreasing the temperature does a few things:
- It increases the threshold voltages of MOSFETs and the forward voltage drop of diodes.
- It decreases the leakage current of MOSFETs and diodes.
- It improves the on-state performance of the MOSFETs.
Because the first two effects directly reduce the leakage current through the transistors, it should be less of a surprise that the charge stored in a DRAM bit can last long enough for a careful reboot process. Once power is reapplied, the DRAM refresh resumes and maintains the stored values.
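To get a feel for how strongly temperature matters, here is a rough sketch. It assumes the common rule of thumb that leakage current roughly doubles for every 10 °C rise. That factor, the 1 s reference retention at 25 °C, and the function name are illustrative assumptions, not figures from any datasheet; real parts vary widely.

```python
def retention_time_s(temp_c, ref_retention_s=1.0, ref_temp_c=25.0, doubling_step_c=10.0):
    """Estimate how long a cell holds its data at temp_c.

    Assumption: leakage doubles every `doubling_step_c` degrees Celsius,
    and retention time is inversely proportional to leakage.
    """
    leakage_ratio = 2.0 ** ((temp_c - ref_temp_c) / doubling_step_c)
    return ref_retention_s / leakage_ratio


# Compare a hot die, room temperature, and a chilled module.
for temp_c in (85, 25, 0, -50):
    print(f"{temp_c:>4} C: about {retention_time_s(temp_c):.3g} s")
```

Under these assumptions, every 10 °C of cooling doubles the time available before the data is lost, which is why chilling the module makes such a difference.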
These basic premises apply to many different circuits, such as microcontrollers or even discrete circuits, as long as nothing is initialized on start-up. Many microcontrollers, for example, reset several registers on start-up, whether the previous contents were preserved or not. Large memory arrays are not likely to be initialized, but control registers are much more likely to be reset on start-up.
If you heat the die enough, you can create the opposite effect: the charge decays so rapidly that the data is lost before the refresh cycle can restore it. However, this should not happen within the specified temperature range. Heating the memory enough for the data to decay faster than the refresh cycle could also slow the circuit down to the point where it can no longer meet the specified memory timings, which would appear as a different error.
This is not related to bit-rot. Bit-rot is either the physical degradation of storage media (CD, magnetic tapes, punch cards) or an event causing the memory to become corrupted, such as an ion impact.
Your capacitor misconceptions aside, it is possible to deliver power by effectively one wire. Driving an antenna is an example of one wire power delivery. Power is delivered to the antenna via the transmitter and the antenna radiates the energy as an electromagnetic wave.

Best Answer
In both cases (EEPROM/flash and DRAM) a small capacitor, on the order of femtofarads, is used. The difference is the way the capacitor is connected.
In the case of DRAM it is connected to the source or drain of a MOSFET. There is a tiny bit of leakage through the transistor channel, and the charge will leak off in a relatively short period of time (seconds or minutes at room temperature). Generally the cells are specified to be refreshed every 64 ms, so even at high temperature the data is reliably held. Reading the data is usually destructive, so it needs to be rewritten after every read.
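The numbers above can be sanity-checked with t = C × ΔV / I. The sketch below is only an order-of-magnitude estimate: the 25 fF cell capacitance, the 0.5 V of usable voltage swing, the 1 fA leakage current, and the 8192-row array are all assumed illustrative values, not specifications of any particular part. Only the 64 ms refresh window comes from the text.

```python
def time_to_lose_bit_s(capacitance_f, usable_swing_v, leakage_a):
    """Time for a constant leakage current to drain the usable voltage swing."""
    return capacitance_f * usable_swing_v / leakage_a


def per_row_refresh_interval_s(refresh_window_s, rows):
    """Average spacing between row refreshes if every row is visited once per window."""
    return refresh_window_s / rows


# Assumed, illustrative values (not from a datasheet).
CELL_CAP_F = 25e-15       # 25 fF storage capacitor
USABLE_SWING_V = 0.5      # voltage the cell can lose and still be read correctly
LEAKAGE_A = 1e-15         # 1 fA total leakage
ROWS = 8192               # rows to refresh per window

REFRESH_WINDOW_S = 64e-3  # 64 ms refresh window, as stated above

hold_s = time_to_lose_bit_s(CELL_CAP_F, USABLE_SWING_V, LEAKAGE_A)
row_interval_s = per_row_refresh_interval_s(REFRESH_WINDOW_S, ROWS)

print(f"estimated hold time: {hold_s:.1f} s")
print(f"margin over refresh window: {hold_s / REFRESH_WINDOW_S:.0f}x")
print(f"one row refreshed every {row_interval_s * 1e6:.2f} us")
```

With these assumed values the cell holds its data for seconds while being refreshed every 64 ms, which illustrates why the refresh specification leaves so much margin at normal temperatures.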
In the case of a flash or EEPROM cell as used to store configuration data, the capacitor is connected to the gate of a MOSFET. The insulation of the gate/capacitor is very close to perfect and the tiny charge will hold for many years, even at high temperature. The disadvantage is that some method such as quantum tunneling must be used to change the charge on the "floating gate", and that is a much slower process, far too slow to be practical for working memory. Reading is fast and non-destructive, at least in the short term. Tunneling subjects the gate insulator to a relatively high voltage gradient, which introduces failure modes in which the cell effectively wears out after a number of writes (typically specified as 10^3 to 10^6 or more).
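To see what the endurance figures mean in practice, here is a small back-of-the-envelope sketch. The write rate of once per hour is a made-up example, and the calculation assumes every write hits the same cell with no wear leveling; the function name is hypothetical.

```python
def years_until_worn_out(endurance_cycles, writes_per_day):
    """Years until one cell reaches its rated write count (no wear leveling assumed)."""
    return endurance_cycles / writes_per_day / 365.0


WRITES_PER_DAY = 24  # assumed: the same location is rewritten once per hour

# Endurance ratings spanning the 10^3 to 10^6 range mentioned above.
for cycles in (10**3, 10**5, 10**6):
    years = years_until_worn_out(cycles, WRITES_PER_DAY)
    print(f"{cycles:>9,} cycles: about {years:.2f} years")
```

Under these assumptions, infrequently written configuration data stays well within the rating at the upper end of the range, while the same cell used as working memory, rewritten many times per second, would wear out almost immediately.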