Imagine a capacitor with an existing, stable, unchanging voltage across it. For example, a DC power supply might have been connected across the capacitor long enough that the capacitor has "charged up." In this case, there is no current because... well... there's no need for any. The circuit has reached equilibrium. It just sits there.
Now, you turn a knob and the DC power supply changes its voltage. The capacitor's voltage must change, too. (You can't have a power supply with one voltage and a capacitor with a different voltage when they are tied together like this.) But it can't change instantly because the capacitor is, in effect, a large reservoir of charge, and to change its voltage you must change that reservoir's "level of charge." To do that, you have to supply (or remove) some charge. But moving charge takes time, and charge moving over time is exactly what a current is. So you must have current to get there.
So, if you change the voltage then that must stimulate some charge to flow onto, or off of, the capacitor. If you change the voltage slowly, then the rate of change of charge in the capacitor's reservoir over time is less. If you change the voltage rapidly, then the rate of change of charge in the capacitor's reservoir must be more. To achieve a faster rate of change of voltage across the capacitor, you must supply a higher current in order to fill (or drain) the capacitor's storage of charge.
The voltage across a capacitor is: \$V=\frac{Q}{C}\$. So with the capacitance held fixed, to get a higher voltage \$V\$ you need more charge \$Q\$.
Now, when you look at the equation:
$$I = C\cdot \frac{\textrm{d}\:V}{\textrm{d}\:t}$$
You can see all that wonderful hand-waving tied up in a package with a nice bow. The current has to be larger if the capacitor has a larger capacitance. Why? Because it is a bigger reservoir and it needs more charge to achieve the same voltage. The current also has to be larger if the rate of change of voltage is greater, for the reasons just discussed above. That equation puts it all in one place.
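To put some numbers on that equation, here is a minimal Python sketch. The 100 µF capacitor and the ramp rates are assumed values picked only for illustration, and `capacitor_current` is a hypothetical helper name.

```python
def capacitor_current(capacitance_f, dv_dt):
    """Current (A) needed to change a capacitor's voltage at dv_dt (V/s)."""
    return capacitance_f * dv_dt

C = 100e-6  # assumed 100 uF capacitor

slow = capacitor_current(C, 1.0)         # voltage ramping at 1 V/s
fast = capacitor_current(C, 1000.0)      # voltage ramping at 1000 V/s
bigger = capacitor_current(10 * C, 1.0)  # ten times the capacitance, 1 V/s

print(f"slow ramp:        {slow:.6f} A")
print(f"fast ramp:        {fast:.6f} A")
print(f"bigger capacitor: {bigger:.6f} A")
```

A faster ramp or a bigger capacitor both call for proportionally more current, which is all the equation says.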
Now, what does this mean regarding lagging or leading currents and voltages? Well, take a look at a sine wave centered on \$y=0\$ with voltage on the \$y\$ axis. Then tell me at what value of \$y\$ is the rate of change of the sine wave at its most rapid. It will be when the voltage is itself at zero. In other words, the current into or out of the capacitor will have to be at its maximum value when the voltage across the capacitor is itself at zero (for the sine wave case, anyway.)
The only thing left to worry about is lagging vs leading, and which is which in the case of the capacitor. This is just a matter of sign. In the case of a capacitor, when the voltage is rapidly rising away from zero in the positive going direction, the conventional current into one side of the capacitor is also conventional current away from the other side. You want to look at this as a current "through" the capacitor, even though physical charges don't actually leap through the insulator of the capacitor. So the sign is taken as positive as the above equation suggests.
Now go back and look at those curves you mentioned. You will see that the shape of the current (which obeys all of the above discussion) through the capacitor will "look like" it is \$90^\circ\$ earlier than the voltage curve. You could also claim that it is \$270^\circ\$ later. But to keep things simple everything should be seen as \$-180^\circ \lt \theta \lt 180^\circ\$. (The special case of \$\theta = 180^\circ=-180^\circ\$ is given a special term: antiphase.)
So the current is said to "lead" voltage in a capacitor. Or else voltage is said to "lag" current in a capacitor. Either way means the same thing. That it happens to do so by exactly \$90^\circ\$ is only true when you aren't taking into account other "parasitics" such as Ohmic resistance in the wires. Resistors develop a voltage drop across them in strict accordance with the current through them. So when the voltage change attempts to cause a current to flow, the resistor immediately opposes this by developing a voltage drop across it, which reduces the voltage that is actually applied to the capacitor. And this shifts the lead/lag calculation so that it is no longer \$90^\circ\$.
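If you'd rather check this numerically than by eye, here is a rough Python sketch. It samples one cycle of a sine-wave voltage, estimates \$I = C\cdot\frac{\textrm{d}V}{\textrm{d}t}\$ with a finite difference, and compares where the two curves peak. The 1 µF, 50 Hz, and sample count are assumptions chosen only for the example.

```python
import math

C = 1e-6    # assumed 1 uF capacitor
f = 50.0    # assumed 50 Hz sine wave
w = 2 * math.pi * f
N = 3600    # samples per cycle, so 10 samples per degree
dt = (1 / f) / N

def v(t):
    """Voltage across the capacitor: a 1 V peak sine centered on zero."""
    return math.sin(w * t)

times = [k * dt for k in range(N)]
voltage = [v(t) for t in times]
# I = C * dV/dt, estimated with a central difference around each sample.
current = [C * (v(t + dt / 2) - v(t - dt / 2)) / dt for t in times]

k_i = max(range(N), key=lambda k: current[k])  # sample where current peaks
k_v = max(range(N), key=lambda k: voltage[k])  # sample where voltage peaks

# Positive result means the current peak comes earlier than the voltage peak.
lead_deg = (k_v - k_i) * 360.0 / N

print(f"voltage at the current peak: {voltage[k_i]:.3f} V")
print(f"current leads voltage by about {lead_deg:.1f} degrees")
```

The current peaks where the voltage crosses zero on its way up, a quarter cycle before the voltage peaks.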
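As a sketch of how series resistance pulls the angle away from \$90^\circ\$, the Python below treats the resistor and capacitor as a series impedance \$R - jX_C\$ driven by a sine-wave source and reports how far the current leads the source voltage. The component values and the helper name `current_lead_deg` are assumptions for illustration only.

```python
import cmath
import math

def current_lead_deg(r_ohms, c_farads, f_hz):
    """Degrees by which the current leads the source voltage in a series RC."""
    x_c = 1 / (2 * math.pi * f_hz * c_farads)  # capacitive reactance
    z = complex(r_ohms, -x_c)                  # series impedance R - jXc
    i = 1 / z                                  # 1 V source as phase reference
    return math.degrees(cmath.phase(i))

C = 1e-6     # assumed 1 uF
F = 1000.0   # assumed 1 kHz
X_C = 1 / (2 * math.pi * F * C)

ideal = current_lead_deg(0.0, C, F)        # no resistance at all
some_r = current_lead_deg(X_C, C, F)       # resistance equal to the reactance
lots_r = current_lead_deg(100 * X_C, C, F) # resistance dominates

for label, angle in (("R = 0", ideal), ("R = Xc", some_r), ("R = 100 Xc", lots_r)):
    print(f"{label:>10}: current leads by {angle:.2f} degrees")
```

With no resistance the lead is the full quarter cycle. As the resistance grows relative to the reactance, the lead shrinks toward zero.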
There are a lot of algebraic tools of the trade you learn to simplify the work you have to do, just like learning to perform long-hand multiplication is a trick that helps you multiply big numbers without having to perform lots and lots of additions, over and over. These tricks are based upon good theoretical ideas. But if you just learn them and apply them without really understanding where they come from, they will still work for you. Just like you don't need to understand why long-hand multiplication works in order to use it.
The capacitor is easier to describe because it works with attributes that are easier to imagine. We can count them. They are units of charge. An inductor works with an equivalent quantity, but as a matter of magnetism. That quantity is measured in Webers instead of Coulombs. And it's hard to "imagine" a Weber and count them (these are "volt-seconds," or \$\int V_t\:\textrm{d}t\$), for us normal folks. Some people have no problem. Others do. But the Weber is the symmetrical counterpart of the unit of charge.
I'll do a short derivation to explain why, taken from an energy perspective (if there is a founding bedrock principle in physics it is the conservation of energy.) Let's follow here:
$$\begin{split}
W &= \frac{1}{2}\:C\: V^2\\\\
\textrm{d} W &= C\: V\:\textrm{d}V + \frac{1}{2}\: V^2\:\textrm{d}C\\\\
\frac{\textrm{d} W}{V} &= C\: \textrm{d}V + \frac{1}{2}\: V\:\textrm{d}C
\end{split}
\quad\leftrightarrow\quad
\begin{split}
W &= \frac{1}{2}\:L\: I^2\\\\
\textrm{d} W &= L\: I\:\textrm{d}I + \frac{1}{2}\: I^2\:\textrm{d}L\\\\
\frac{\textrm{d} W}{I} &= L\: \textrm{d}I + \frac{1}{2}\: I\:\textrm{d}L
\end{split}$$
noting,
$$\textrm{where } I=\frac{\textrm{d}Q}{\textrm{d}t}\textrm{ and } V=\frac{\textrm{d}W}{\textrm{d}Q}\textrm{ and d}L=0 \textrm{ and d}C=0$$
resulting in,
$$\begin{split}
\frac{\textrm{d} W}{\frac{\textrm{d}W}{\textrm{d}Q}} &= C\: \textrm{d}V \\\\
\frac{\textrm{d} W}{\textrm{d}W}\textrm{d}Q &= C\: \textrm{d}V \\\\
\textrm{d} Q &= C\: \textrm{d}V\\\\
\int\textrm{d} Q &= \int C\: \textrm{d}V\\\\
Q &= C\: V
\end{split}
\quad\leftrightarrow\quad
\begin{split}
\frac{\textrm{d} W}{\frac{\textrm{d}Q}{\textrm{d}t}}&= L\: \textrm{d}I\\\\
\frac{\textrm{d} W}{\textrm{d}Q}\textrm{d}t &= L\: \textrm{d}I\\\\
V\textrm{d}t &= L\: \textrm{d}I\\\\
\int V\textrm{d}t &= \int L\: \textrm{d}I\\\\
V\:t &= L\: I
\end{split}$$
And there you are. Countable things on the left. But weird units on the right. Volt-seconds (Webers) are to inductors as Coulombs of charge are to capacitors.
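A small numeric sketch of that symmetry in Python: the same pulse shape is integrated once as a current into a capacitor (giving Coulombs) and once as a voltage across an inductor (giving volt-seconds). The pulse shape, the 1 mF and 1 mH values, and the `integrate` helper are assumptions made up for the example.

```python
def integrate(samples, dt):
    """Rectangle-rule integral of equally spaced samples."""
    return sum(samples) * dt

dt = 1e-6  # 1 us time step
# Assumed pulse: 5 units for 1 ms, then 2 units for 2 ms.
pulse = [5.0] * 1000 + [2.0] * 2000

# Capacitor: the pulse is a current in amps, its integral is charge.
C = 1e-3                           # assumed 1 mF
coulombs = integrate(pulse, dt)    # Q = integral of I dt
capacitor_voltage = coulombs / C   # V = Q / C

# Inductor: the pulse is a voltage in volts, its integral is volt-seconds.
L = 1e-3                             # assumed 1 mH
volt_seconds = integrate(pulse, dt)  # integral of V dt, in Webers
inductor_current = volt_seconds / L  # I = (integral of V dt) / L

print(f"capacitor: {coulombs:.4f} C  -> {capacitor_voltage:.2f} V")
print(f"inductor:  {volt_seconds:.4f} Wb -> {inductor_current:.2f} A")
```

Both start from zero stored energy here. Accumulated Coulombs set the capacitor's voltage in the same way that accumulated volt-seconds set the inductor's current.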
Another bit of sleight of hand from the above can be had, as well:
$$\begin{split}
\textrm{d} Q &= C\: \textrm{d}V\\\\
\frac{\textrm{d} Q}{\textrm{d} t} &= C\: \frac{\textrm{d}V}{\textrm{d} t}\\\\
I &= C\: \frac{\textrm{d}V}{\textrm{d} t}
\end{split}
\quad\leftrightarrow\quad
\begin{split}
V\textrm{d}t &= L\: \textrm{d}I\\\\
\frac{V\textrm{d}t}{\textrm{d} t} &= L\: \frac{\textrm{d}I}{\textrm{d} t}\\\\
V &= L\: \frac{\textrm{d}I}{\textrm{d} t}
\end{split}$$
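For sine waves, the time derivative in each of those final equations turns into multiplication by \$j\omega\$, which is a standard phasor result rather than something derived above. Under that assumption, and with made-up component values, a short Python sketch shows the mirror-image behavior: the capacitor's current leads its voltage, while the inductor's voltage leads its current.

```python
import cmath
import math

w = 2 * math.pi * 1000.0  # assumed 1 kHz, in rad/s
C = 1e-6                  # assumed 1 uF
L = 1e-3                  # assumed 1 mH

# Capacitor: I = C dV/dt becomes I = jwC * V for a sinusoidal V.
v_cap = 1 + 0j
i_cap = 1j * w * C * v_cap
cap_current_lead = math.degrees(cmath.phase(i_cap / v_cap))

# Inductor: V = L dI/dt becomes V = jwL * I for a sinusoidal I.
i_ind = 1 + 0j
v_ind = 1j * w * L * i_ind
ind_voltage_lead = math.degrees(cmath.phase(v_ind / i_ind))

print(f"capacitor: current leads voltage by {cap_current_lead:.1f} degrees")
print(f"inductor:  voltage leads current by {ind_voltage_lead:.1f} degrees")
```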
Sorry about the diversion. But I thought it may help some (and you?)
Best Answer
Let's start by pointing out that circuit 1 is the same as circuit 2 and that measuring an output across the resistor will look like a high pass filter whereas measuring it across the capacitor will look like a low pass filter. The filter will have a transitional point (or 3 dB point or half power point) at: -
\$f = \dfrac{1}{2\pi RC}\$
If you measure across the resistor and the capacitor at this transitional frequency you'll see that the RMS voltage is the same at: -
\$\dfrac{V_{IN}}{\sqrt2}\$
You'll also find that |Xc| = R at this frequency and that the voltage across the resistor is exactly 90 degrees out of phase with the voltage across the capacitor. If you mess around a bit more you will find that this is always the situation because the current through both components is the same. If the current is the same then the voltage across C always lags the voltage across R by 90 degrees.
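These claims are easy to check with complex arithmetic. The Python sketch below models the series RC as a voltage divider, using assumed values of 1 kΩ, 1 µF, and a 1 V input (none of which come from the original circuits), and a hypothetical helper named `divider`.

```python
import cmath
import math

R = 1_000.0  # assumed 1 kOhm
C = 1e-6     # assumed 1 uF
V_IN = 1.0   # assumed 1 V input, used as the phase reference

def divider(f_hz):
    """Return (voltage across R, voltage across C) as complex phasors."""
    z_c = complex(0, -1 / (2 * math.pi * f_hz * C))  # capacitor impedance -jXc
    i = V_IN / (R + z_c)                             # same current in both parts
    return i * R, i * z_c

f_c = 1 / (2 * math.pi * R * C)  # transitional (3 dB) frequency
v_r, v_c = divider(f_c)

print(f"f_c = {f_c:.1f} Hz")
print(f"|V_R| = {abs(v_r):.4f} V, |V_C| = {abs(v_c):.4f} V")

# The angle between V_R and V_C stays at 90 degrees at any frequency.
angles = []
for f in (f_c / 10, f_c, 10 * f_c):
    a, b = divider(f)
    angles.append(math.degrees(cmath.phase(a / b)))
    print(f"{f:10.1f} Hz: V_R leads V_C by {angles[-1]:.2f} degrees")
```

At the transitional frequency both magnitudes come out at the input divided by \$\sqrt2\$, and the 90 degree angle between the two voltages does not depend on frequency.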
It's also true that at very high frequencies Xc is very much smaller than R so the voltage across R closely resembles the input voltage whereas the voltage across C will be very small. At very low frequencies the opposite is true - the voltage across C is very similar to the voltage on the input and the voltage across R is tiny.
Only at these extremes does the phase shift relative to the input approach 90 degrees, and it is seen across whichever component has the lower impedance.
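To see the extremes in numbers, here is a rough Python sketch that evaluates the same series RC divider far below and far above the transitional frequency. The 1 kΩ, 1 µF, 1 V input, and the factor of 1000 either side are assumptions chosen only to make the trend obvious.

```python
import cmath
import math

R = 1_000.0  # assumed 1 kOhm
C = 1e-6     # assumed 1 uF
V_IN = 1.0   # assumed 1 V input, used as the phase reference

def phasors(f_hz):
    """Return (voltage across R, voltage across C) as complex phasors."""
    z_c = complex(0, -1 / (2 * math.pi * f_hz * C))
    i = V_IN / (R + z_c)
    return i * R, i * z_c

def describe(v):
    """Magnitude in volts and phase in degrees relative to the input."""
    return abs(v), math.degrees(cmath.phase(v))

f_c = 1 / (2 * math.pi * R * C)

low_r, low_c = (describe(v) for v in phasors(f_c / 1000))
high_r, high_c = (describe(v) for v in phasors(f_c * 1000))

print("far below f_c:")
print(f"  across R: {low_r[0]:.4f} V at {low_r[1]:+.2f} degrees")
print(f"  across C: {low_c[0]:.4f} V at {low_c[1]:+.2f} degrees")
print("far above f_c:")
print(f"  across R: {high_r[0]:.4f} V at {high_r[1]:+.2f} degrees")
print(f"  across C: {high_c[0]:.4f} V at {high_c[1]:+.2f} degrees")
```

In each case the component carrying the tiny voltage is the one sitting close to 90 degrees from the input, while the other one tracks the input almost exactly.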