Electronic – Why not always use DMA in favor of interrupts with UART on STM32?


I spent a lot of time last month trying to get UART (for MIDI) to work on an STM32 (STM32F103C8T6) using interrupts, without much success.

However, this evening I got it working quite quickly using DMA.

As far as I have read, DMA is faster and relieves the CPU, so why not always use DMA in favor of interrupts? Especially since the interrupt approach seems to have quite a few problems on the STM32.

I'm using STM32CubeMx/HAL.

Best Answer

While DMA relieves the CPU and thus may reduce latency of other interrupt-driven applications running on the same core, there are costs associated with it:

  • There is only a limited number of DMA channels, and there are restrictions on which channels can serve which peripherals. Another peripheral on the same channel may be better suited for DMA use.

    For example, a bulk I2C transfer every 5 ms is a better candidate for DMA than an occasional debug command arriving on UART2.

  • Setting up and maintaining DMA is a cost in itself. (Setting up DMA is normally considered more complex than setting up a plain per-character interrupt-driven transfer: there is memory management, more peripherals are involved, DMA uses interrupts itself, and you may need to parse the first few characters outside of DMA anyway, see below.)

  • DMA may use additional power, since it is yet another block that needs to be clocked. On the other hand, you can suspend the CPU while the DMA transfer is in progress, if the core supports that.

  • DMA requires memory buffers to work with (unless you are doing peripheral-to-peripheral DMA), so there is some memory cost associated with it.

    (The memory cost may also be there when using per-character interrupts, but it may be much smaller or vanish entirely if the messages are interpreted right away inside the interrupt.)

  • DMA introduces latency, because the CPU is only notified when the transfer is complete or half complete (see the other answers).

  • Except when streaming data into/from a ring buffer, you need to know in advance how much data you will be receiving/sending.

    • This may mean that the first characters of a message have to be processed using per-character interrupts: for example, when interfacing with an XBee, you’d first read the packet type and size and then trigger a DMA transfer into an allocated buffer.

    • For other protocols, this may not be possible at all if they only use end-of-message delimiters: for example, text-based protocols which use '\n' as a delimiter. (Unless the DMA peripheral supports matching on a character.)
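
To illustrate the ring-buffer exception mentioned above, here is a minimal sketch of pulling '\n'-terminated lines out of a buffer that DMA fills circularly. It is host-side C with no hardware access; the names (rx_ring_t, rx_ring_get_line, RX_BUF_SIZE) are made up for this example. On a real STM32 the write position would have to come from the DMA controller, for instance as the buffer size minus the remaining transfer count (HAL exposes that count through __HAL_DMA_GET_COUNTER); that wiring is an assumption and is not shown here.

```c
#include <stddef.h>
#include <stdint.h>

#define RX_BUF_SIZE 64u  /* hypothetical size; must hold at least one full message */

typedef struct {
    uint8_t data[RX_BUF_SIZE]; /* memory the DMA controller writes into, circularly */
    size_t  read_pos;          /* next byte the CPU has not consumed yet */
} rx_ring_t;

/*
 * Copy one '\n'-terminated line out of the ring into `out` (without the '\n').
 * `write_pos` is the index the DMA will write to next (0 .. RX_BUF_SIZE-1).
 * Returns the line length, or -1 if no complete line has arrived yet.
 * Lines longer than out_cap-1 characters are truncated.
 * Caveat: if the DMA laps read_pos before the CPU catches up, data is lost.
 */
int rx_ring_get_line(rx_ring_t *rb, size_t write_pos, char *out, size_t out_cap)
{
    size_t pos = rb->read_pos;
    size_t len = 0;

    if (out_cap == 0u) {
        return -1; /* no room even for the terminator */
    }

    while (pos != write_pos) {
        uint8_t c = rb->data[pos];
        pos = (pos + 1u) % RX_BUF_SIZE;

        if (c == '\n') {
            out[len] = '\0';
            rb->read_pos = pos; /* consume the line including its delimiter */
            return (int)len;
        }
        if (len + 1u < out_cap) {
            out[len++] = (char)c;
        }
    }

    /* No delimiter found: leave read_pos untouched and rescan on the next call. */
    return -1;
}
```

The function would be called from the main loop or from the half/full-transfer (or UART idle) interrupt. Note that the CPU still has to scan every byte for the delimiter; DMA only removes the per-byte interrupt, not the parsing work.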

As you can see, there are a lot of trade-offs to consider here. Some are related to hardware limitations (number of channels, conflicts with other peripherals, matching on characters), some are based on the protocol used (delimiters, known length, memory buffers).
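
For the known-length case, the hand-over from per-character interrupts to DMA can be sketched as a small state machine. The frame layout below (one type byte, one length byte, then the payload) is a made-up example, not the actual XBee format, and all names are hypothetical. Starting the real DMA transfer is left to the caller, since that part depends on the hardware and driver in use.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PAYLOAD 32u /* hypothetical upper bound on payload size */

typedef enum { RX_WAIT_TYPE, RX_WAIT_LEN, RX_PAYLOAD_DMA } rx_state_t;

typedef struct {
    rx_state_t state;
    uint8_t    type;
    uint8_t    payload[MAX_PAYLOAD]; /* target buffer for the DMA transfer */
    size_t     payload_len;
} rx_frame_t;

/*
 * Call from the per-character RX interrupt while the header is being read.
 * Returns the number of payload bytes the caller should now request via DMA,
 * or 0 if the header is not complete yet (or the frame was rejected).
 */
size_t rx_on_header_byte(rx_frame_t *f, uint8_t byte)
{
    switch (f->state) {
    case RX_WAIT_TYPE:
        f->type = byte;
        f->state = RX_WAIT_LEN;
        return 0;

    case RX_WAIT_LEN:
        if (byte == 0u || byte > MAX_PAYLOAD) {
            f->state = RX_WAIT_TYPE; /* empty or oversized frame: start over */
            return 0;
        }
        f->payload_len = byte;
        f->state = RX_PAYLOAD_DMA;
        return f->payload_len;

    case RX_PAYLOAD_DMA:
    default:
        /* DMA owns the receiver now; per-character interrupts should be off. */
        return 0;
    }
}

/* Call from the DMA transfer-complete interrupt. */
void rx_on_dma_complete(rx_frame_t *f)
{
    /* payload[0 .. payload_len) is valid here; hand it off, then re-arm. */
    f->state = RX_WAIT_TYPE;
}
```

A zero-initialized rx_frame_t starts in RX_WAIT_TYPE. This also shows the extra moving parts compared to plain interrupts: two interrupt sources, a buffer whose ownership changes, and a resynchronization path for bad headers.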

To add some anecdotal evidence, I have faced all of these trade-offs in a hobby project which used many different peripherals with very different protocols. The decisions were mostly based on the question "how much data am I transferring and how often am I going to do that?". This essentially gives you a rough estimate of the impact of simple interrupt-driven transfer on the CPU. I thus gave priority to the aforementioned I2C transfer every 5 ms over the UART transfer every few seconds, which used the same DMA channel. Another UART transfer, which happened more often and carried more data, in turn got priority over another I2C transfer which happened more rarely. It’s all trade-offs.
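
The "how much data, how often" question can be turned into a back-of-the-envelope number. The helper below is only a sketch: the function name is made up, and the cycles-per-interrupt figure has to be measured or guessed for the actual handler.

```c
#include <stdint.h>

/*
 * Rough share of CPU time spent in a per-byte interrupt handler,
 * in parts per million (10000 ppm = 1 %).
 * cycles_per_irq must include interrupt entry/exit overhead; it is an
 * assumption you have to supply, not something this function can know.
 * Intended for realistic inputs; extreme values could overflow 64 bits.
 */
uint32_t irq_load_ppm(uint32_t bytes_per_second,
                      uint32_t cycles_per_irq,
                      uint32_t cpu_hz)
{
    if (cpu_hz == 0u) {
        return 0u; /* avoid division by zero on bad input */
    }
    uint64_t busy_cycles = (uint64_t)bytes_per_second * cycles_per_irq;
    return (uint32_t)(busy_cycles * 1000000u / cpu_hz);
}
```

For the MIDI case from the question: 31250 baud with 10 bits per byte on the wire gives at most 3125 bytes per second. Assuming roughly 100 cycles per interrupt and a 72 MHz core clock, irq_load_ppm(3125, 100, 72000000) yields 4340 ppm, i.e. about 0.43 % of the CPU even with the line fully saturated. With numbers like these, per-character interrupts are cheap and the DMA channel may be better spent elsewhere.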

Of course, using DMA also has advantages, but that’s not what you asked about.