The mobile phone charger is a power conversion circuit that changes your power line voltage (110 or 220 V) into something useful for your mobile phone (probably 5 V). To do this it needs some electronic circuitry inside, which has to be powered, and it has to function even when there is no phone around so that it can detect one when you connect it.
The charger could be merely a mechanical device like the power socket itself, but then all the charging circuitry would have to be inside your phone. Unfortunately that circuitry is quite big and relatively heavy, so it would be inconvenient to carry it around all the time.
Regarding the actual 30mW figure: if instead of mW you consider the currents involved, you arrive at around 300μA (30mW at 100V). This corresponds to an equivalent resistance of about \$330\,\mathrm{k\Omega}\$. It is quite difficult to work with resistances higher than this and currents lower than this while still having to sense the moment when somebody plugs in the actual load.
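The arithmetic behind those two figures can be checked with a short sketch. The variable names are my own, and the 100 V value is the round number used above rather than an exact mains voltage. It gives roughly 333 kΩ, which rounds to the 330 kΩ quoted.

```python
# Figures taken from the text: 30 mW standby power at a round 100 V.
standby_power_w = 0.030
line_voltage_v = 100.0

current_a = standby_power_w / line_voltage_v   # I = P / V
resistance_ohm = line_voltage_v / current_a    # R = V / I (same as V^2 / P)

print(f"current: {current_a * 1e6:.0f} uA")
print(f"equivalent resistance: {resistance_ohm / 1e3:.0f} kOhm")
```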
OTOH 30mW is really, really small. The vampire current draw problems are not as important as many believe. If you want a good review of many aspects of this, I suggest reading "Sustainable Energy – without the hot air", especially the chapter on this topic.
Am I correct that the faster processor draws more power (and thus
dissipates more heat) under a computational load?
Not necessarily. There are two major components of power dissipation - static power (the power you burn whenever the chip is powered on) and dynamic/switching power (the power burned when the clock is running). While running the same chip at a higher frequency will result in more power dissipation, two chips are not necessarily the same: a chip may be sold at the slower rating precisely because its static power dissipation, combined with the dynamic power at the faster clock rate, is too high to meet the bin requirements for the faster rating.
If so, is the power under computational load approximately
proportional to the rated/clocked frequency? In other words, inasmuch
as the one processor is clocked 8 percent faster than the other, does
it run about 8 percent hotter under load? Another way to ask the same
question is to ask: does each processor process about the same
quantity of data per unit of energy? or, if battery powered, can each
accomplish about as much before its battery dies?
For a given chip running identical calculations, the dynamic portion of the power consumption will be proportional to the clock frequency. The total power dissipation of the processor will increase a bit less than 8% for an 8% increase in clock frequency due to the static power dissipation.
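As a rough illustration of why the increase is less than 8%, here is a minimal sketch of a two-term power model. The 2 W static and 8 W dynamic split is a made-up example, not data for any real chip, and the model assumes the supply voltage stays the same at both clock rates.

```python
def total_power(static_w, dynamic_w_at_base, freq_ratio):
    """Static power is fixed; dynamic power scales linearly with clock frequency."""
    return static_w + dynamic_w_at_base * freq_ratio

# Hypothetical split: 2 W static, 8 W dynamic at the base clock.
base = total_power(2.0, 8.0, 1.00)
fast = total_power(2.0, 8.0, 1.08)   # clock raised by 8%

increase_pct = (fast - base) / base * 100
print(f"total power increase: {increase_pct:.1f}%")
```

With these assumed numbers the total rises by about 6.4%. The larger the static share, the further the increase falls below 8%.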
When not under load, do the two processors idle equally cool, or are
there practical or theoretical factors that make the one idle cooler
than the other?
If you had two identical chips idling, the one with the lower clock frequency would dissipate less power. When the chips are idling, the static power becomes a much larger portion of the total power dissipation, so any differences in static power between the two chips would be more noticeable.
Even if the processor's price were not determinate, might one prefer
the slower processor merely for the sake of cooler operation and
extended battery life?
Possibly, but again, you have a lot less of a guarantee that this would be the case. If you bought chips with different rated TDPs, then you could safely make this argument. Otherwise, you're at the mercy of the binning algorithm and the consistency of the manufacturer's process. Also, note that we're talking about power dissipation, not energy consumption. A faster processor may be able to complete a computationally heavy task faster, and switch to a low power idle mode sooner than a slower processor.
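The difference between power and energy can be sketched with a simple model: run the task at active power, then idle for the rest of a fixed time window. All wattages and durations below are hypothetical, chosen only to show that the chip with the higher power draw can still use less energy overall.

```python
def energy_over_window(active_w, idle_w, task_s, window_s):
    """Energy in joules: run the task at active_w, then idle for the rest of the window."""
    if task_s > window_s:
        raise ValueError("task does not fit in the window")
    return active_w * task_s + idle_w * (window_s - task_s)

# Hypothetical chips: the faster one draws more power but finishes sooner.
slow_j = energy_over_window(active_w=10.0, idle_w=1.0, task_s=10.0, window_s=20.0)
fast_j = energy_over_window(active_w=12.0, idle_w=1.0, task_s=8.0, window_s=20.0)

print(f"slow chip: {slow_j:.0f} J, fast chip: {fast_j:.0f} J")
```

Whether the faster chip actually wins depends on how low its idle power is and how much extra active power the higher clock costs.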
Would the answers differ for embedded processors?
Yes. The static power dissipation is most significant on the bleeding edge processes that Intel, TSMC, IBM, and Global Foundries use. Embedded processors are often optimized for low static power dissipation and use larger processes where static power dissipation is a much smaller portion of power dissipation. The variation at those larger process nodes is much less, so microcontrollers are much less susceptible to variation in power and frequency performance.
Best Answer
If you have a processor which can operate at two frequencies when it isn't idling, say f1 and f2, then there will be a different power consumption per frequency, as explained in other answers here.
The power consumption depends on the frequency in a non-linear fashion, so you might have:
f1 100MHz 1W
f2 200MHz 2.5W
If you have to execute 100 million instructions and the processor can do one instruction per clock cycle, you can do it at f1 or f2:
energy used at f1 = 100M instructions/100MHz / 1 (instruction/cycle) * 1W = 1J
energy used at f2 = 100M instructions/200MHz / 1 (instruction/cycle) * 2.5W = 1.25J
So at f2 the execution is completed in 0.5s instead of the 1s at f1, but it took more energy.
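The same calculation as a small sketch, using the example figures above. The function name and its parameters are my own.

```python
def task_energy(instructions, freq_hz, power_w, instr_per_cycle=1.0):
    """Return (run time in seconds, energy in joules) for a fixed instruction count."""
    run_time_s = instructions / (freq_hz * instr_per_cycle)
    return run_time_s, run_time_s * power_w

# Example operating points from the text: 100 MHz at 1 W and 200 MHz at 2.5 W.
for name, freq_hz, power_w in [("f1", 100e6, 1.0), ("f2", 200e6, 2.5)]:
    run_time_s, energy_j = task_energy(100e6, freq_hz, power_w)
    print(f"{name}: {run_time_s:.2f} s, {energy_j:.2f} J")
```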
However, there are other considerations in a computer system: for example, if you can get a disk drive into an idle state sooner because the processing has finished then the savings from the disk drive power consumption may be greater than the extra energy used in the processing. Another example: if the user can finish their work in half the time, they can shut down the computer and save on energy used to run the monitor.
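To make the disk drive example concrete, here is a sketch that adds a disk to the f1/f2 figures. The 1 W disk power is an assumed value, and the model assumes the CPU and disk both drop to negligible power once the task finishes.

```python
def system_energy_j(cpu_w, disk_w, run_time_s):
    """Energy while CPU and disk are both active; assumes both power down afterwards."""
    return (cpu_w + disk_w) * run_time_s

disk_w = 1.0  # hypothetical disk power while spinning

at_f1 = system_energy_j(cpu_w=1.0, disk_w=disk_w, run_time_s=1.0)   # 100 MHz case
at_f2 = system_energy_j(cpu_w=2.5, disk_w=disk_w, run_time_s=0.5)   # 200 MHz case

print(f"f1: {at_f1:.2f} J, f2: {at_f2:.2f} J")
```

Under these assumptions the faster setting uses less total energy (1.75 J against 2 J), even though the processor alone uses more.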