Well, if you were supplying the circuit from a DC supply, it would take about 5 time constants before the cap could be regarded as fully charged. You could argue that an AC scenario would be no worse, because the power arrives intermittently over a longer period, giving the resistor a little time to cool. So my advice is to stick with the DC circuit when analysing.
5 time constants is a total elapsed time of about 17 seconds (on DC), so you have to start digging around in the data sheets of power resistors to see what their short-term peak overload rating is.
For instance, a 1 watt rated resistor may be able to take 10 watts for 1 second or 100 watts for 0.1 seconds, etc. Regarding the current at any point in the charging process, consider this graph:
Along the base is time (measured in RC time constants). The rising graph tells you the capacitor voltage as a percentage of the fully charged voltage (500 V x 1.4142, and not 500 V).
The falling graph gives you percentage current, and at 0.5 RC it is 61% of I max (which happens to be 0.7 A and not 0.5 A as per the question). So you could do a calculation based on approximate figures, i.e. between t=0 and t=0.5RC you could assume the current to be 100%, then in the next half time constant assume it is 61%, etc.
Or, you could do some digging around and develop the exponential formula to give a more precise figure for power.
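As a rough sketch of that exponential approach, the Python below evaluates the resistor current, instantaneous power and accumulated energy for the DC charging case. The component values are assumptions inferred from the figures quoted above (a 707 V peak with 0.7 A maximum current implies roughly 1 kΩ, and 5 time constants of 17 s implies a time constant of 3.4 s), so substitute your own.

```python
import math

# Assumed values, inferred from the figures in the answer (not given explicitly):
V_PEAK = 500 * math.sqrt(2)   # peak of a 500 V RMS supply, about 707 V
R = 1000.0                    # ohms, so that I_max = V_PEAK / R is about 0.7 A
TAU = 17.0 / 5                # seconds, since 5 time constants is about 17 s
C = TAU / R                   # farads, implied by TAU = R * C


def current(t):
    """Charging current in amps at time t (seconds) for a DC step."""
    return (V_PEAK / R) * math.exp(-t / TAU)


def power(t):
    """Instantaneous power dissipated in the resistor, in watts."""
    return current(t) ** 2 * R


def energy(t):
    """Energy dissipated in the resistor from 0 to t, in joules.

    This is the integral of power(t), which works out to
    0.5 * C * V^2 * (1 - exp(-2t / tau)).
    """
    return 0.5 * C * V_PEAK ** 2 * (1 - math.exp(-2 * t / TAU))


for n in (0, 0.5, 1, 2, 5):
    t = n * TAU
    print(f"t = {n:>3} RC: I = {current(t):.3f} A, "
          f"P = {power(t):7.1f} W, E = {energy(t):6.1f} J")
```

With these assumed values the power starts at about 500 W and the resistor ends up absorbing about 850 J in total (the same as the energy stored in the capacitor). Those are the numbers to compare against the short-term overload curves in the data sheet.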
Typically you would not want to switch big loads using mechanical switches, as they bring several problems, in particular contact wear due to inrush currents. This problem is exacerbated by capacitive loads: when you turn on a capacitive load, you get a large inrush current, and rugged contacts are needed to handle it. This would be the case in which the capacitor is after the switch.
This is also the case the pdf you linked suggests. In fact, they want you to put that capacitor in parallel with the strip, so that the voltage at the strip will not increase abruptly when you connect the strip to the PSU (i.e. when you turn on the switch). With that big capacitor, the voltage at the strip will rise with a time constant of C * R, where C is 1000uF and R is the wiring resistance + contact resistance + PSU output resistance (* see note).
If you don't fit such a large capacitor, the voltage at the strip could rise "instantaneously". This would create an inrush current on the strip, which could be very high. In fact, a 100nF decoupling capacitor is mounted for each LED (see pdf). Since the LED density might be as high as 144 LEDs/meter, you would have 14.4uF per meter. Say you have 1 meter, so 14.4uF. If the voltage rises with a 14.4us time constant (this implies a total parasitic resistance of about 1 Ohm, which I guess is too much), you'll have a spike with an initial inrush current (I = C * dV/dt) of Vdd * C / Tau = 5 * 14.4uF / 14.4us = 5A. This is an additional current, which must be added to your load current, and it could damage something (the strip traces).
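The estimate above can be reproduced with a few lines of Python, which makes it easy to try other strip lengths or time constants. This is only a sketch of the same back-of-envelope formula (I = Vdd * C / Tau); the function and variable names are my own.

```python
def strip_capacitance(leds_per_meter, c_per_led, length_m):
    """Total decoupling capacitance on the strip, in farads."""
    return leds_per_meter * c_per_led * length_m


def inrush_current(v_supply, capacitance, tau):
    """Initial inrush current in amps, using I = C * dV/dt with dV/dt = V / tau."""
    return v_supply * capacitance / tau


# Values taken from the paragraph above
c_strip = strip_capacitance(leds_per_meter=144, c_per_led=100e-9, length_m=1.0)
tau = 14.4e-6                      # assumed rise time constant, seconds
i_inrush = inrush_current(5.0, c_strip, tau)
r_implied = tau / c_strip          # parasitic resistance implied by tau = R * C

print(f"C = {c_strip * 1e6:.1f} uF, I = {i_inrush:.1f} A, R = {r_implied:.2f} Ohm")
```

A smaller parasitic resistance gives a shorter time constant and so a proportionally higher initial current from the same formula.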
If you put in that additional 1000uF capacitor, the voltage will rise at a much slower rate (dV/dt). There will also be a much larger current spike, not in the LED strip, but between the PSU and the capacitor.
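To put rough numbers on the slower rise, the sketch below compares the time constant, the initial dV/dt and the total charge delivered, with and without the 1000uF capacitor. It assumes a simple single-RC model and reuses the 1 Ohm parasitic resistance implied earlier, which is only a guess.

```python
V_SUPPLY = 5.0         # volts
R_PARASITIC = 1.0      # ohms, assumed (wiring + contact + PSU output resistance)
C_STRIP = 14.4e-6      # farads, 144 LEDs x 100 nF
C_BULK = 1000e-6       # farads, the added capacitor


def rise_figures(c_total):
    """Return (time constant in s, initial dV/dt in V/s, charge in C)."""
    tau = R_PARASITIC * c_total
    dv_dt = V_SUPPLY / tau         # initial slope of the exponential rise
    charge = c_total * V_SUPPLY    # total charge the PSU must deliver
    return tau, dv_dt, charge


tau_without, dvdt_without, q_without = rise_figures(C_STRIP)
tau_with, dvdt_with, q_with = rise_figures(C_STRIP + C_BULK)

print(f"without bulk cap: tau = {tau_without * 1e6:.1f} us, "
      f"dV/dt = {dvdt_without:.3g} V/s, Q = {q_without * 1e6:.0f} uC")
print(f"with bulk cap:    tau = {tau_with * 1e6:.1f} us, "
      f"dV/dt = {dvdt_with:.3g} V/s, Q = {q_with * 1e6:.0f} uC")
```

In this model the bulk capacitor slows the rise at the strip by roughly a factor of 70, while the charge that has to flow between the PSU and the capacitor grows by the same factor.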
However, if you put that big capacitor AFTER the switch, the inrush current could damage over time:
1) the power supply;
2) the capacitor;
3) the switch (as a large spark will occur when contacts close).
If you want, consider using a pMOSFET or an integrated load switch, where you can easily select the turn-on/turn-off times, and drive it with a smaller/cheaper switch. That way, you can limit the inrush current. Just as an example, look at the application circuit of the Si1865: http://www.vishay.com/docs/71297/si1865dl.pdf. Of course you must choose a load switch or MOSFET that can handle your current (so the Si1865 itself won't suit your needs).
Leaving that big capacitor where it is now is even more harmful than not having it at all, because you are also reducing the PSU output impedance.
Notes:
* you must include the PSU output resistance when its output capacitor is much smaller than 1000uF.
Best Answer
This will not provide an answer on how to calculate the capacitor size, but I'll use the answer space to draw a small model of a NeoPixel strip + Power supply + cable. This might help you understand the role of the capacitor. I'm sorry if you already knew any of this.
Data lines are not shown, etc.
[Schematic: model of the NeoPixel strip, power supply and cable, created using CircuitLab]
Ok, so when the strip draws current, that current goes through the cable, and the parasitic R of the cable causes a voltage drop (U = R*i, Ohm's law). The longer and/or thinner the cable, the higher the parasitic R and the more severe the voltage drop.
During these voltage drops, the capacitor is there as a buffer, as the guide suggests. This also means that if the parasitic R of the cable (and of the power supply) is negligible, the capacitor will make little difference. In the same way, its benefits are limited by the parasitic R of the strip itself (measure the voltage at the far end of a long strip: it will be far below 5V when on, and you might even notice some color change).
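To get a feel for the size of that drop, here is a small Python sketch that estimates the cable resistance from its length and cross-section and then applies U = R*i. The cable dimensions and load current are made-up example values, not taken from the question; the resistivity is the usual figure for copper at room temperature.

```python
RHO_COPPER = 1.68e-8   # ohm-metres, copper at about 20 degrees C


def cable_resistance(length_m, area_mm2):
    """Round-trip resistance of a two-conductor cable, in ohms."""
    area_m2 = area_mm2 * 1e-6
    return RHO_COPPER * (2 * length_m) / area_m2   # out and back


def voltage_drop(resistance, current):
    """Ohm's law: U = R * i."""
    return resistance * current


# Hypothetical example: 2 m of 0.5 mm^2 cable carrying 3 A
r_cable = cable_resistance(length_m=2.0, area_mm2=0.5)
drop = voltage_drop(r_cable, current=3.0)

print(f"R = {r_cable * 1000:.1f} mOhm, drop = {drop:.2f} V, "
      f"strip sees about {5.0 - drop:.2f} V")
```

Doubling the length or halving the cross-section doubles the drop, which matches the "longer and/or thinner" remark above.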
So while it may be possible to calculate ripple vs capacitor size, I suggest you determine this empirically, if possible watching the 5V line with a scope as the LEDs light up.