Linear regulators like the 7805 are inefficient, and more so when the input voltage is higher. A linear regulator works as a variable resistor that adjusts its resistance to keep the output voltage constant, here 5V. That means that the current consumed by your 5V circuit also flows through this variable resistor. If your circuit draws 1A then the power dissipation in the 7805 will be
\$ P = \Delta V \cdot I = (9V - 5V) \cdot 1A = 4W \$
4W in a single component is rather a lot; the 5W in your circuit will probably be distributed over several components. It means that the 7805 will need a heatsink, and that's most often a bad sign: too much power dissipation. This gets worse with higher input voltages. The efficiency of the regulation can be calculated as
\$ \eta = \dfrac{P_{OUT}}{P_{IN}} = \dfrac{V_{OUT} \cdot I_{OUT}}{V_{IN} \cdot I_{IN}} = \dfrac{V_{OUT}}{V_{IN}}\$
since \$I_{OUT} = I_{IN}\$ (neglecting the regulator's own small ground current).
So in this case \$\eta = \dfrac{5V}{9V} = 0.56 \$ or 56%. With higher input voltages the efficiency gets even worse.
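To see how quickly this gets worse, here is a minimal Python sketch of the two formulas above. The function name and the list of input voltages are my own illustrative choices, and it assumes an ideal regulator (ground current ignored, dropout not checked).

```python
def linear_regulator(v_in, v_out, i_load):
    """Return (dissipation in W, efficiency) for an ideal linear regulator.

    Assumes input current equals output current, i.e. the regulator's
    own ground current is ignored.
    """
    p_diss = (v_in - v_out) * i_load  # voltage drop times load current
    efficiency = v_out / v_in         # P_OUT / P_IN with I_OUT = I_IN
    return p_diss, efficiency


# Illustrative input voltages; 9 V is the case worked out above.
for v_in in (7.0, 9.0, 12.0, 15.0):
    p_diss, eta = linear_regulator(v_in, v_out=5.0, i_load=1.0)
    print(f"Vin = {v_in:4.1f} V: dissipation = {p_diss:4.1f} W, efficiency = {eta:.0%}")
```

The 9V row reproduces the 4W and 56% figures; the other rows only show the trend.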
The solution is a switching regulator, or switcher for short. There are different types of switcher depending on the \$V_{IN}/V_{OUT}\$ ratio. If \$V_{OUT}\$ is less than \$V_{IN}\$ you use a buck converter.
While even an ideal linear regulator has a low efficiency, an ideal switcher is 100% efficient, and the actual efficiency can be predicted from the properties of the components used. For instance, there is a voltage drop across the diode and resistance in the coil. A well-designed switcher may have an efficiency as high as 95%, as is achievable for the given 5V/9V ratio. Different voltage ratios may result in somewhat lower efficiencies. Anyway, 95% efficient means that the power dissipated in the regulator is
\$ P_{SWITCHER} = \left(\dfrac{1}{\eta} - 1\right) \cdot P_{OUT} = \left(\dfrac{1}{0.95} - 1\right) \cdot 5W = 0.26W \$
which is low enough not to need a heatsink. As a matter of fact the switching regulator itself may be in a SOT23 package, with the other components, like the coil and diode, being SMDs as well.
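The same loss formula can be evaluated for a few efficiencies to get a feel for the numbers. This is only a sketch: the function name and the efficiencies other than 95% are assumptions of mine, not measured values for any particular part.

```python
def switcher_loss(p_out, efficiency):
    """Power dissipated in the regulator for a given output power and efficiency."""
    return (1.0 / efficiency - 1.0) * p_out


# 5 W output (5 V at 1 A) as in the example; efficiencies below 95% are illustrative.
for eta in (0.95, 0.90, 0.85, 0.80):
    print(f"efficiency = {eta:.0%}: loss = {switcher_loss(5.0, eta):.2f} W")
```

Even at 80% the loss stays well below the 4W of the linear regulator.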
Basically, your problem is that one pulse of energy through the inductor/capacitor is far more than you need to keep the output voltage "in regulation". You have chosen to short out the current limit resistor, but this resistor is fundamental to controlling the energy per pulse.
Energy per pulse is \$ \dfrac{L \cdot I^2}{2} \$, where I is the peak inductor current, and you have no control over the value at which I limits. The power delivered to the load is this energy multiplied by the number of pulses per second.
Because it is only "doing" 500Hz it certainly is in discontinuous mode and this, given what I've just said, shouldn't surprise you any more. You need to choose a value of current limit resistor that stops too much energy being delivered to the load.
You might also try reducing the value of L to about 100uH; for the same peak current this reduces the energy per pulse by a factor of 6.8.
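A quick sketch of the energy-per-pulse arithmetic, in Python. Two values here are assumptions on my part: the 680uH starting inductance is only inferred from the factor of 6.8, and the 0.5A peak current is a made-up placeholder, since the real peak is exactly what the missing current limit resistor leaves undefined. Losses are ignored.

```python
def energy_per_pulse(inductance, i_peak):
    """Energy stored in the inductor at the end of one pulse: L * I^2 / 2, in joules."""
    return 0.5 * inductance * i_peak ** 2


def delivered_power(inductance, i_peak, f_pulse):
    """Power to the load in discontinuous mode: energy per pulse times pulse rate."""
    return energy_per_pulse(inductance, i_peak) * f_pulse


I_PEAK = 0.5     # A, hypothetical peak current (not given in the question)
F_PULSE = 500.0  # Hz, the pulse rate mentioned above

# 680 uH is inferred from the factor of 6.8; 100 uH is the suggested value.
for inductance in (680e-6, 100e-6):
    e = energy_per_pulse(inductance, I_PEAK)
    p = delivered_power(inductance, I_PEAK, F_PULSE)
    print(f"L = {inductance * 1e6:5.0f} uH: {e * 1e6:5.1f} uJ per pulse, {p * 1e3:5.2f} mW at 500 Hz")
```

Because the energy goes with the square of the current, setting the current limit has a much stronger effect than changing L.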
Best Answer
You don't want a switcher. Your input-output difference is so low (for most of the discharge curve the voltage is less than 3.8 V) that even at 90 % efficiency a switcher won't do much better than an LDO.
Have a look at the Seiko S1167:
edit
Found an even better one in the S1313:
Seiko doesn't give data for ground current under load, but in my experience you should count on 1 % of load current, so at 1 mA that would be around 10 µA, most likely less than 50 µA.
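To put a number on the LDO-versus-switcher comparison, here is a rough Python sketch that includes the ground current. The 3.3 V output is purely an assumption of mine, since the output voltage isn't quoted here; the 3.8 V input, 1 mA load and 10 µA ground current are the figures from above.

```python
def ldo_efficiency(v_in, v_out, i_load, i_gnd):
    """Efficiency of an LDO, counting ground current as extra input current."""
    p_out = v_out * i_load
    p_in = v_in * (i_load + i_gnd)
    return p_out / p_in


V_OUT = 3.3  # V, hypothetical output voltage (not stated in the answer)

eta = ldo_efficiency(v_in=3.8, v_out=V_OUT, i_load=1e-3, i_gnd=10e-6)
print(f"LDO efficiency at 3.8 V in: {eta:.1%}")
```

Under these assumptions the LDO lands within a few percent of a 90 % switcher, and it improves as the battery voltage drops towards the output voltage.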