I don't know this specific MCU, but the question is fairly generic.
The obvious answer to all problems like these is to keep the ISRs as slim as possible. At most, they should stuff data into a ring buffer, which is later processed by the main program. (An MCU with DMA would be even better. Some MSP430 families do include a DMA controller, so check whether your particular part has one.)
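To make the ring buffer idea concrete, here is a minimal sketch of a single-producer, single-consumer byte buffer, assuming the ISR is the only caller of `rb_put` and the main loop is the only caller of `rb_get`. The names `ring_buffer_t`, `rb_put`, `rb_get` and the size `RB_SIZE` are made up for illustration and are not from any MSP430 library.

```c
#include <stdbool.h>
#include <stdint.h>

#define RB_SIZE 16u /* hypothetical size; must be a power of two */

typedef struct {
    volatile uint8_t head;          /* next write slot, modified only by the ISR */
    volatile uint8_t tail;          /* next read slot, modified only by main */
    volatile uint8_t data[RB_SIZE];
} ring_buffer_t;

/* Call from the ISR: store one byte, or return false if the buffer is full. */
bool rb_put(ring_buffer_t *rb, uint8_t byte)
{
    uint8_t next = (uint8_t)((rb->head + 1u) & (RB_SIZE - 1u));
    if (next == rb->tail) {
        return false;               /* full: one slot is always left unused */
    }
    rb->data[rb->head] = byte;
    rb->head = next;                /* publish only after the data is written */
    return true;
}

/* Call from the main loop: fetch one byte, or return false if empty. */
bool rb_get(ring_buffer_t *rb, uint8_t *out)
{
    if (rb->tail == rb->head) {
        return false;               /* empty */
    }
    *out = rb->data[rb->tail];
    rb->tail = (uint8_t)((rb->tail + 1u) & (RB_SIZE - 1u));
    return true;
}
```

Because each index is a single byte written by only one side, neither side needs to disable interrupts on a typical single-core MCU. The ISR body then shrinks to reading the RX register and calling `rb_put`.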
If the ISR is still too slow after such optimizations, then you have no other option but to do as you suggest: re-enable global interrupts at the top of the UART ISR and let the higher priority interrupt take precedence. Be aware, however, that when you allow more interrupts to come on top of an already executing ISR, you increase the worst-case stack depth, so make sure the stack is large enough for it.
Do I also need to unmask the serial RX interrupt at the end?
I assume that you have to do that from the ISR no matter the nature of the application? That's how most MCUs work. And yes, if you touch the global interrupt mask, you will have to clear the flag of that specific interrupt after you are done serving it, i.e. after you have copied the received data into local variables.
After the timer ISR returns, will the serial ISR continue, and then return to the main loop?
If you have changed the global interrupt mask, then yes.
Are there any potential race conditions I need to consider?
You always have to consider race conditions whenever you share data between an ISR and something else. It doesn't matter if the ISR only writes and the main program only reads, etc., unless you can guarantee that each access is atomic, which you usually can't unless you write the code in assembler.
In a high-level language, you may have to use some semaphore variable. How this is implemented is application-dependent. In particular, it depends on whether you can afford to miss some data from the ISR or have to catch all of it.
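As one possible shape for such a semaphore variable, here is a sketch of a one-slot handshake for the case where losing data is acceptable. It assumes a single-core MCU where a one-byte flag is written atomically. The names `isr_publish`, `main_take`, `sample_ready` and `overruns` are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

static volatile uint16_t sample;        /* multi-byte value shared with the ISR */
static volatile bool     sample_ready;  /* the "semaphore": true while main owns sample */
static volatile uint16_t overruns;      /* values dropped because main was too slow */

/* Call from the ISR with the freshly received value. */
void isr_publish(uint16_t value)
{
    if (sample_ready) {
        overruns++;                     /* main has not consumed the last value: drop */
        return;
    }
    sample = value;
    sample_ready = true;                /* hand ownership to main */
}

/* Call from the main loop; returns true if a new value was copied to *out. */
bool main_take(uint16_t *out)
{
    if (!sample_ready) {
        return false;
    }
    *out = sample;                      /* safe: the ISR will not write while the flag is set */
    sample_ready = false;               /* hand ownership back to the ISR */
    return true;
}
```

The ISR never touches `sample` while the flag is set, so the main loop cannot read a half-updated value. If you cannot afford to drop data, a queue such as a ring buffer is the usual replacement for the single slot.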
Is there a better solution?
DMA or a multi-core MCU.
Best Answer
Your XMEGA runs at 32 MHz. With the line TCC0.CTRLA = TC_CLKSEL_DIV1024_gc you have selected a prescaler of 1024, so the timer clock is 32000000/1024 = 31250 Hz (31.25 kHz). One timer tick therefore lasts 1/31250 s = 32 us. Multiply that by the number of timer ticks the function needed to complete (2323) and you get the elapsed time: 2323 x 32 us = 74336 us, which is about 74 ms.
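The same arithmetic can be wrapped in a small helper. This is only a sketch: `timer_ticks_to_us` is a made-up name, and it assumes you already know the core clock, the prescaler and the tick count read from the timer.

```c
#include <stdint.h>

/* Convert a timer tick count to microseconds.
 * elapsed_us = ticks * prescaler * 1e6 / f_cpu_hz
 * A 64-bit intermediate avoids overflow in the multiplication. */
uint32_t timer_ticks_to_us(uint32_t f_cpu_hz, uint32_t prescaler, uint32_t ticks)
{
    uint64_t scaled = (uint64_t)ticks * prescaler * 1000000u;
    return (uint32_t)(scaled / f_cpu_hz);
}
```

For the numbers above, `timer_ticks_to_us(32000000u, 1024u, 2323u)` gives 74336 us, i.e. roughly 74 ms.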