Odd vs Even Parity
This will depend slightly on what kind of communication you are using. I know you said you are using UART, but I am going to answer a bit more broadly. If you do decide to go with parity, select the option that forces the parity bit to take the opposite level from your "idle" state when the data bits all sit at the idle level.
If you have an active high system, then all 0's is idle. So make it odd parity, so that the parity bit has to be a 1 and therefore changes state from idle.
In an active low system, where all 1's is idle, you should look at how many bits the parity is computed over. If it is 8 bits, then even parity results in a 0 parity bit for an all-1's word, which follows the same idea of forcing the parity bit to change state.
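The two cases above can be checked with a few lines of Python. This is only a sketch: the helper name `parity_bit` and the 8-bit default width are assumptions made for illustration, not part of any particular UART.

```python
def parity_bit(data: int, width: int = 8, odd: bool = False) -> int:
    """Return the parity bit for the low `width` bits of `data`."""
    ones = bin(data & ((1 << width) - 1)).count("1")
    even_bit = ones & 1  # the bit that makes the total number of 1s even
    return even_bit ^ 1 if odd else even_bit


# Active high, idle is all 0s: odd parity forces the parity bit to 1.
print(parity_bit(0x00, odd=True))   # 1

# Active low, idle is all 1s over 8 bits: even parity forces it to 0.
print(parity_bit(0xFF, odd=False))  # 0
```

In both cases the parity bit ends up at the opposite level from the idle line, which is the property being selected for.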
Should you use Parity?
Well, this is a bit of a difficult question. In general we like to model noise as Gaussian, which means that bit errors are random and independent of one another. In actuality, the noise that affects our system is not always random, because the things that cause errors on a PCB are usually emissions radiated from something else. If you think about it, for a trace that short to pick up enough noise to cause a bit error, the source had to be something rather extreme. With a noise source like this, there is a decent chance that you will flip more than one bit, and parity is useless against an even number of bit errors. Without diving into the math, parity will help, but not by a lot. If you can't afford to do much processing, then parity may be the best that you can do.
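To make the "even number of bit errors" point concrete, here is a small sketch. The data word and the flipped bit positions are arbitrary values chosen for illustration.

```python
def even_parity(word: int) -> int:
    """Parity bit that makes the total number of 1s even."""
    return bin(word).count("1") & 1


def parity_check_passes(data: int, parity: int) -> bool:
    """Receiver side: recompute parity and compare with the received bit."""
    return even_parity(data) == parity


sent = 0b10110010
p = even_parity(sent)

one_flip = sent ^ 0b00000100   # a single bit error
two_flips = sent ^ 0b00001100  # two adjacent bit errors, e.g. a short burst

print(parity_check_passes(one_flip, p))   # False: the error is caught
print(parity_check_passes(two_flips, p))  # True: the error slips through
```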
Why use a CRC?
First of all, you say you have a built-in CRC generator, which means a CRC should be very easy for you to compute. CRCs are much better at catching multiple-bit errors. In an environment where you want a very low chance of letting any error through, you really want to use a CRC. In fact, go with the biggest CRC you can afford in your system. One little trick that I know works for at least CRC-16, if not others: if you run the CRC over the received message with its CRC still appended, you should get 0 as the result. If you have hardware to compute the CRC, this is a very efficient way of both generating and checking it.
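Here is a sketch of the zero-result trick in Python. It assumes the CRC-16/XMODEM parameters (polynomial 0x1021, initial value 0, no bit reflection, no final XOR) and that the CRC is appended most significant byte first; your hardware generator may use different parameters, and variants with a final XOR leave a fixed non-zero constant instead of 0.

```python
def crc16_xmodem(data: bytes, crc: int = 0x0000) -> int:
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


message = b"123456789"
crc = crc16_xmodem(message)

# Sender: append the CRC, high byte first.
frame = message + crc.to_bytes(2, "big")

# Receiver: run the same CRC over the whole frame, CRC bytes included.
print(crc16_xmodem(frame) == 0)  # True for an intact frame

# Flip two adjacent bits: parity would miss this, the CRC does not.
corrupted = bytearray(frame)
corrupted[0] ^= 0x03
print(crc16_xmodem(bytes(corrupted)) == 0)  # False
```

The receiver never has to split the CRC off the message and compare; it just feeds every received byte through the same generator and tests for zero at the end.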
You asked whether you would ever need to use the "hold" state of the shift register. In your case, since you are bit-banging the shift register, you can selectively inhibit the clock going to the register. Under this condition there is really no need to use the "hold" state, because the same thing can be achieved by holding the clock pin at a constant level (either high or low).
The real use for the "hold" state comes into play in cases where the clock is a constantly running signal and hardware logic is implemented to produce LOAD and ENABLE pulses to the shift register to permit operation at the desired times. Such logic has to produce the pulses properly synchronized to the clock signal, so that setup and hold times around the rising edge of the clock are met and the shift register operates correctly.
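A rough behavioural model of that free-running-clock case, in Python. The class `ShiftRegister`, its `load`/`enable` arguments, and the MSB-first shift direction are assumptions for illustration and do not correspond to the pinout of any specific part; setup and hold timing is not modelled at all.

```python
class ShiftRegister:
    """Parallel-load shift register sampled on every rising clock edge."""

    def __init__(self, width: int = 8) -> None:
        self.width = width
        self.value = 0

    @property
    def serial_out(self) -> int:
        # The most significant bit is presented on the serial output.
        return (self.value >> (self.width - 1)) & 1

    def rising_edge(self, load: bool = False, enable: bool = False,
                    data: int = 0) -> None:
        mask = (1 << self.width) - 1
        if load:
            self.value = data & mask            # parallel load
        elif enable:
            self.value = (self.value << 1) & mask  # shift by one bit
        # Neither asserted: "hold", contents stay unchanged.


sr = ShiftRegister()
sr.rising_edge(load=True, data=0xA5)

# The clock keeps running, but with LOAD and ENABLE low nothing changes.
sr.rising_edge()
sr.rising_edge()
print(hex(sr.value))  # 0xa5

bits = []
for _ in range(8):
    bits.append(sr.serial_out)
    sr.rising_edge(enable=True)
print(bits)  # [1, 0, 1, 0, 0, 1, 0, 1]
```

With a bit-banged clock you would simply not call `rising_edge` during the idle period, which is why the hold mode is unnecessary there.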
Best Answer
I would feed the byte into a parity generator, then load the data bits plus the generator's output into a 9-bit parallel-in, serial-out shift register, and shift away...
Sure, it's two chips, but it's better than a programmable part.
74HC280 and SY10E142
Or equivalents.
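As a software picture of what the two chips do together, here is a minimal sketch. The function names, the LSB-first bit order, and placing the parity bit last are assumptions chosen to match common UART framing; this is not a pin-level model of either part.

```python
def parity_generator(byte: int, odd: bool = False) -> int:
    """Combinational parity over 8 data bits."""
    even_bit = bin(byte & 0xFF).count("1") & 1
    return even_bit ^ 1 if odd else even_bit


def piso_frame(byte: int, odd: bool = False) -> list:
    """Load 8 data bits plus parity into a 9-bit word, then shift it out."""
    word = (parity_generator(byte, odd) << 8) | (byte & 0xFF)
    # Data bits leave LSB first, the parity bit leaves last.
    return [(word >> i) & 1 for i in range(9)]


print(piso_frame(0x41))            # [1, 0, 0, 0, 0, 0, 1, 0, 0]
print(piso_frame(0x41, odd=True))  # [1, 0, 0, 0, 0, 0, 1, 0, 1]
```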