The 33V TVS isn't good enough. Rated reverse standoff voltage is always lower than breakdown voltage. For instance the Littelfuse 1.5KE39A is rated at 33V, but breakdown voltage can be as high as 41V. The LM78L05's absolute maximum input voltage is only 35V. A TVS is still a good idea though, since you're working in an automotive environment. I'll get back to it in a minute.
Russell suggests using a series resistor, and I concur. The resistor will drop the input voltage and form a low-pass filter with the LM78L05's input capacitor. I would even go a step further and also place a zener diode on the input, so that you get a shunt pre-regulator. This increases current consumption a little, but with a 1k\$\Omega\$ series resistor and a 12V zener it will only be 6mA.
You can then safely use a 1.5KE20A TVS. It's rated at 17V, and with a maximum breakdown voltage of 21V your regulator will be safe.
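To put numbers on the pre-regulator, here is a minimal sketch of the arithmetic. The input voltages are my assumptions (the 6mA figure corresponds to 18V in under this simple model), the function name `prereg` is made up for illustration, and the corner frequency ignores the zener's dynamic resistance and any extra input capacitance.

```python
import math

def prereg(v_in, v_zener=12.0, r_series=1e3, c_in=330e-9):
    """Series-resistor current, resistor dissipation and RC corner frequency
    for a zener shunt pre-regulator (idealised: the zener clamps exactly at v_zener)."""
    # Total current through the resistor while the zener is clamping;
    # it is shared between the zener and the regulator's load.
    i_r = max(v_in - v_zener, 0.0) / r_series
    p_r = i_r ** 2 * r_series                      # power burned in the resistor
    f_c = 1.0 / (2 * math.pi * r_series * c_in)    # low-pass corner of R with C_in
    return i_r, p_r, f_c

# Assumed input voltages: normal charging, a high rail, and the TVS clamp level.
for v_in in (13.8, 18.0, 21.0):
    i_r, p_r, f_c = prereg(v_in)
    print(f"Vin={v_in:5.1f} V  I_R={i_r * 1e3:4.1f} mA  "
          f"P_R={p_r * 1e3:5.1f} mW  f_c={f_c:6.1f} Hz")
```

Even with the input sitting at the TVS's 21V maximum breakdown voltage, the resistor dissipates well under 100mW in this model, so an ordinary 1/4W part should do.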
Then the capacitors. If TI says a 330nF is required at the input, for pete's sake, put it there! I would also add a 10 to 33\$\mu\$F electrolytic; the zener is a voltage regulator, and all regulated voltages need a buffer capacitor.
PS: unless the datasheet mentions a minimum dV/dt for the input voltage (it doesn't) there's no upper limit for the input capacitor. Go for that terafarad cap if you feel like it.
further reading
Application Notes for Transient Voltage Suppressors from ON Semiconductor.
I think that the hold time for the MLCC can be calculated numerically, as in the following example. The total charge Q in a linear capacitor is C times V. But an MLCC is not a linear capacitor, so Q=f(V) (some function that we will assume is known for now).
At time 0, let V=5 V. At this voltage Q0=f(5)=240 uC.
After some unknown small time step, the voltage has dropped to 4.9 V. The charge in the capacitor is now Q1=f(4.9)=237.65 uC (for example).
Assuming a constant current sink I of 10 mA, and remembering that I·(delta time)=delta Q, we can calculate delta time=(240-237.65 uC)/(10 mA)=0.235 ms. The first time step took 0.235 ms.
After the following time step, the voltage dropped to 4.8 V. The new charge will be Q2=f(4.8)=235.2 uC. This time step is then (237.65-235.2)/10 mA=0.245 ms.
If this is continued until the voltage reaches the minimum allowable voltage for your circuit, you only need to add up all the time steps to get the hold time.
I chose voltage steps of 0.1 V, but values smaller or bigger can be chosen to get more or less accuracy in the final result. The problem remains to find function f(V).
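The stepping procedure can be sketched in a few lines of Python. The function name `hold_time` and the linear test function are mine, chosen only so the result can be checked against the familiar C·ΔV/I; plug in your own f(V) for a real MLCC.

```python
def hold_time(f, v_start, v_min, i_load, dv=0.1):
    """Add up the time steps dt = dQ / I while the voltage falls from v_start to v_min.

    f      : charge as a function of voltage, Q = f(V), in coulombs
    i_load : constant load current in amperes
    dv     : voltage step in volts
    """
    t = 0.0
    v = v_start
    while v > v_min + 1e-12:
        v_next = max(v - dv, v_min)         # never step below the minimum voltage
        t += (f(v) - f(v_next)) / i_load    # time needed to remove this slice of charge
        v = v_next
    return t

def f_linear(v):
    """Sanity-check case: an ideal linear 47 uF capacitor, Q = C * V."""
    return 47e-6 * v

# Discharge from 5 V to 4 V at 10 mA; should agree with C * dV / I.
print(hold_time(f_linear, 5.0, 4.0, 10e-3))
```

One thing worth noticing: with a constant load current the individual steps cancel in pairs, so the sum is just (f(V_start) - f(V_min))/I whatever step size you pick. The step size starts to matter when the load current itself depends on the voltage, for example with a resistive or constant-power load.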
The capacitance values from the "Capacitance vs DC Bias" graph in the datasheet give the relationship between Delta_Q and Delta_V at every DC bias voltage; i.e. they give the capacitance seen by a small signal.
I think that a good approximation of f(V) could be obtained by integrating that curve: \$f(V)=\int_0^V C(V')\,dV'\$, where C(V') is read from the "Capacitance vs DC Bias" graph.
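A rough sketch of that integration, using the trapezoidal rule on points read off the graph. The table below is a placeholder that I made up to show the mechanics; it is NOT taken from any datasheet, so replace it with values from your own part. The helper names `interp_c` and `charge` are also just for illustration.

```python
# Placeholder (DC bias in V, capacitance in F) points. NOT real datasheet data.
C_CURVE = [(0.0, 47e-6), (1.0, 44e-6), (2.0, 38e-6),
           (3.0, 31e-6), (4.0, 25e-6), (5.0, 20e-6)]

def interp_c(v, curve=C_CURVE):
    """Small-signal capacitance at bias v, linearly interpolated between table points."""
    for (v0, c0), (v1, c1) in zip(curve, curve[1:]):
        if v0 <= v <= v1:
            return c0 + (c1 - c0) * (v - v0) / (v1 - v0)
    raise ValueError("bias voltage outside the tabulated range")

def charge(v, curve=C_CURVE, n=1000):
    """Q(v) = integral from 0 to v of C(v') dv', by the trapezoidal rule with n slices."""
    if v == 0:
        return 0.0
    h = v / n
    total = 0.5 * (interp_c(0.0, curve) + interp_c(v, curve))
    for k in range(1, n):
        total += interp_c(k * h, curve)
    return total * h

# First time step of the example: 5 V -> 4.9 V with a 10 mA constant-current load.
q0, q1 = charge(5.0), charge(4.9)
print(f"Q(5.0) = {q0 * 1e6:.2f} uC, Q(4.9) = {q1 * 1e6:.2f} uC, "
      f"dt = {(q0 - q1) / 10e-3 * 1e3:.3f} ms")
```

Since the capacitance near 5 V is much lower than the nominal value, each 0.1 V slice there holds less charge than the nominal capacitance would suggest, and that is exactly what the method is meant to capture.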
Finally, there is a FAQ from Murata http://www.murata.com/en-global/support/faqs/products/capacitor/mlcc/char/… where the physics behind the capacitance change is explained:
Without a DC voltage, spontaneous polarization can happen freely. However, when a DC voltage is externally applied, spontaneous polarization is tied to the direction of the electric field in the dielectric, and independent reversal of spontaneous polarization is inhibited. As a result, the capacitance becomes lower than before applying the bias.
This explanation would also apply to decreasing DC voltages. If DC voltage slowly decreases (during capacitor discharge) the polarization won't be tied to a particular direction and then the capacitance will increase.
The calculation of the hold-up time, using this method, can be done rather easily with Excel. I attach a worksheet with real datasheet data for a given 47 uF MLCC capacitor and the necessary calculations:
Hold-up time comparison between the given 47 uF MLCC capacitor and a 40 uF linear one
Best Answer
Mainly, because every chip can't be right next to the regulator. The further your chip is from the regulator that's supplying it, the more resistance and inductance there is in the connection from the regulator to the Vcc pin (and from the ground pin on the way back).
If the current draw of your chip changes, this resistance and inductance will result in a change in the voltage at the Vcc pin.
There are two ways to look at this.
First, when your chip changes its current draw, that di/dt will create a voltage drop across the inductance back to the voltage source. You want a capacitor that can supply (or sink) the current delta until the current from the source can respond.
Unfortunately choosing a capacitor this way requires knowing two things that you often don't know: What will be the di/dt generated by the chip (this one you might actually know in some cases), and what's the inductance of the connection to the source (this you could simulate with a good power integrity tool, but that's expensive).
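If you do have estimates for those two numbers, the arithmetic is short: V = L·di/dt for the drop, and C = ΔI·Δt/ΔV for the capacitor that has to carry the current step in the meantime. Here is a sketch with made-up values (10 nH of path inductance, a 100 mA step in 1 ns, a source that takes 1 µs to respond, 50 mV of allowed droop); none of these come from the question.

```python
def inductive_drop(l_path, di, dt):
    """Voltage across the supply-path inductance for a current step di in time dt."""
    return l_path * di / dt

def bypass_capacitance(di, t_response, dv_allowed):
    """Capacitance that supplies di for t_response with at most dv_allowed of droop."""
    return di * t_response / dv_allowed

# Hypothetical numbers, for illustration only.
v_drop = inductive_drop(l_path=10e-9, di=0.1, dt=1e-9)
c_min = bypass_capacitance(di=0.1, t_response=1e-6, dv_allowed=0.05)
print(f"drop without bypass: {v_drop:.2f} V, capacitor needed: {c_min * 1e6:.1f} uF")
```

With these numbers the unbypassed drop would be about 1 V, which is why even a short trace matters, and the capacitor works out to about 2 µF. This ignores the capacitor's own ESL and ESR, so treat it as a lower bound.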
Second, you can design your bypass capacitors to provide a low-impedance connection to ground at all frequencies you're interested in.
A low-valued capacitor will have a high impedance at low frequencies because \$Z=\dfrac{1}{j\omega{}C}\$.
A high-valued capacitor will require a larger package and have a high impedance at high frequencies because of its equivalent series inductance (ESL), for which \$Z=j\omega{}L\$.
The solution is to put several values of capacitor in parallel, so that all frequencies are covered. A good capacitor vendor will provide ESL and ESR characteristics so that you can simulate your combination of capacitors and find a combination that works.
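If you don't have a simulator handy, a first look at a capacitor combination takes only a few lines. This sketch models each capacitor as a series R-L-C (ESR, ESL, C) and puts them in parallel. The three sets of values are assumptions for illustration, not vendor data, and board inductance is left out.

```python
import math

def cap_impedance(f, c, esr, esl):
    """Complex impedance of a real capacitor modelled as ESR + ESL + C in series."""
    w = 2 * math.pi * f
    return esr + 1j * w * esl + 1 / (1j * w * c)

def parallel(impedances):
    """Impedance of several branches connected in parallel."""
    return 1 / sum(1 / z for z in impedances)

# Hypothetical parts as (C in F, ESR in ohm, ESL in H); replace with vendor figures.
CAPS = [
    (100e-6, 0.10, 5e-9),    # bulk electrolytic
    (10e-6, 0.02, 2e-9),     # mid-value ceramic
    (0.1e-6, 0.02, 0.5e-9),  # small ceramic at the Vcc pin
]

for f in (1e3, 1e4, 1e5, 1e6, 1e7, 1e8):
    z = parallel([cap_impedance(f, c, esr, esl) for c, esr, esl in CAPS])
    print(f"{f:12.0f} Hz   |Z| = {abs(z):8.4f} ohm")
```

Sweep more finely than this before trusting the result: parallel capacitors of different values can form anti-resonance peaks between their self-resonant frequencies, and a coarse frequency list can step right over them.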
A common set-up is a 0.1 uF ceramic capacitor at the Vcc pin of each chip, and a few large-valued electrolytics spread around the board (not necessarily one per chip). Whether this is appropriate for your design isn't clear from what you've shared.
Generally the high values (in bigger packages and often electrolytics) don't need to be as close to the chip as the small-value (small package) capacitors, because they are useful at lower frequencies where inductance separating them from the load (chip) has less effect. Maybe one 10 uF capacitor can be shared between 4 or more loads. And a few 47 or 100 uF capacitors can be sprinkled around the board.