Electronic – How to debug microcontroller code during run time

cortex-m0, debugging, microcontroller

I'm using a Nuc240 microcontroller which has its UART0 configured for 'printf', and I use a UART-to-USB converter to view the output on screen. This also works at run time, so I can monitor things after exiting debug mode.
But is this the best method, or are there alternative methods available?
Also, I use IAR Embedded Workbench and the Terminal I/O window is not showing the 'printf' output. What could be the reason for this? I'm using a NuLink-Pro as the debugger.

Best Answer

You asked for best method, but gave no criteria for judging "best".

Using a UART at run time for monitoring and debugging is certainly one method that can have its uses. I often do that when the processor has a spare UART, or the UART is already being used for communication with a host.

However, using the monstrous printf and doing binary to ASCII conversion in the micro is usually not a good idea. It drags in a lot of code, can use significant runtime cycles, and results in high communication overhead.

It's usually better to keep things as simple as possible, both in the micro and in the communication protocol. Push the burdens of data conversion, presentation to the user, floating point handling, and the like to the host as much as possible. The host has essentially infinite memory and cycles for these tasks, whereas the micro doesn't.

A scheme I use most of the time is an opcode-based binary protocol in both directions. Everything that is sent starts with an opcode byte. This is then followed by whatever data bytes are defined for that opcode.

For clarity and consistency of documentation, I call the packets sent to the embedded system "commands", and those sent from the embedded system to the host "responses". Responses aren't necessarily only sent in response to commands, but the naming distinction is useful. Commands and responses each have their own opcode space.

I have a few standard commands and responses I always use. Command and response 0 is always NULL, meaning it is a valid opcode, but does nothing and takes no data parameters. Command 1 is PING, which takes no parameters but sends the PONG response, which is also 1 with no parameters. Command 2 is FWINFO. That sends the FWINFO response (also 2), which gives the type ID, version, and sequence number of the firmware.

In the micro, I have a task that reads bytes from the UART and processes the command stream. It starts by reading the next byte as an opcode. A dispatch table is used to jump to the routine for that command. That routine reads whatever data bytes the command may have, performs the command, and goes back to the main loop that gets the next opcode byte.
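A minimal sketch of such a dispatch loop in C might look like the following. The names `uart_get_byte`, `uart_put_byte`, `command_task`, and the firmware ID values are assumptions made for illustration, and the byte I/O is backed by RAM buffers so the sketch can run on a PC. On real hardware those two routines would wrap the UART driver, and locking of the response stream is left out to keep the example short.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical byte I/O, backed by RAM buffers instead of a real UART. */
const uint8_t *rx_data;          /* simulated incoming command stream */
size_t rx_len, rx_pos;
uint8_t tx_data[32];             /* simulated outgoing response stream */
size_t tx_len;

int uart_get_byte(uint8_t *b)
{
    if (rx_pos >= rx_len) return 0;      /* no more input available */
    *b = rx_data[rx_pos++];
    return 1;
}

void uart_put_byte(uint8_t b)
{
    if (tx_len < sizeof tx_data) tx_data[tx_len++] = b;
}

/* Opcodes named in the text; commands and responses have separate spaces. */
enum { CMD_NULL = 0, CMD_PING = 1, CMD_FWINFO = 2, CMD_COUNT };
enum { RSP_NULL = 0, RSP_PONG = 1, RSP_FWINFO = 2 };

/* Placeholder firmware identification (made-up values). */
#define FW_TYPE 37
#define FW_VER   4
#define FW_SEQ  12

typedef void (*cmd_handler_t)(void);

void cmd_null(void) { /* valid opcode, does nothing */ }

void cmd_ping(void) { uart_put_byte(RSP_PONG); }

void cmd_fwinfo(void)
{
    uart_put_byte(RSP_FWINFO);
    uart_put_byte(FW_TYPE);
    uart_put_byte(FW_VER);
    uart_put_byte(FW_SEQ);
}

/* Dispatch table indexed by command opcode. A handler for a command that
   carries data bytes would read them itself with uart_get_byte(). */
static const cmd_handler_t cmd_table[CMD_COUNT] = {
    [CMD_NULL]   = cmd_null,
    [CMD_PING]   = cmd_ping,
    [CMD_FWINFO] = cmd_fwinfo,
};

/* Read opcodes and dispatch until the input runs out. A real task would
   block waiting for the next byte instead of returning. */
void command_task(void)
{
    uint8_t op;
    while (uart_get_byte(&op)) {
        if (op < CMD_COUNT && cmd_table[op] != NULL)
            cmd_table[op]();
        /* Unknown opcodes are silently ignored in this sketch. */
    }
}
```

Adding a command then means adding one opcode, one handler, and one table entry. How to treat an unknown opcode (ignore it, or report an error) is a design choice this sketch does not settle.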

Responses are sent by whatever task has something to send, whenever it has something to send. To allow multiple tasks to send responses independently, I have a mutex for the response stream. To send a response, you have to acquire the mutex, send the response bytes, and release the mutex. This allows various parts of the system to send responses asynchronously as needed, while guaranteeing each response is sent as a whole without bytes from other responses interleaved with it.
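The acquire, send, release pattern can be sketched as below. This is only an illustration: it uses a POSIX `pthread_mutex_t` as a stand-in for whatever mutex the RTOS or scheduler on the micro provides, and `send_response`, `uart_put_byte`, and the RAM buffer standing in for the UART are hypothetical names.

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>

/* Stand-in for the RTOS mutex that guards the response stream. */
pthread_mutex_t rsp_mutex = PTHREAD_MUTEX_INITIALIZER;

/* RAM buffer standing in for the UART transmit path (assumption). */
uint8_t rsp_stream[64];
size_t rsp_len;

void uart_put_byte(uint8_t b)
{
    if (rsp_len < sizeof rsp_stream) rsp_stream[rsp_len++] = b;
}

/* Send one complete response: the opcode byte followed by its data bytes.
   Holding the mutex for the whole response keeps bytes from different
   tasks from being interleaved. */
void send_response(uint8_t opcode, const uint8_t *data, size_t n)
{
    pthread_mutex_lock(&rsp_mutex);
    uart_put_byte(opcode);
    for (size_t i = 0; i < n; i++)
        uart_put_byte(data[i]);
    pthread_mutex_unlock(&rsp_mutex);
}
```

Any task can then call `send_response()` without knowing what other tasks are doing. The lock is held only for the duration of one response, so the waiting time for other senders is bounded by the longest response.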

On the host side, the response stream is handled in a separate thread. It gets each opcode byte, branches to the routine to handle that response, etc. This is usually incorporated in a test program that presents a command line interface to the remote system. The user enters commands at a prompt. Each command routine parses the command parameters and sends whatever binary command to the remote system that might be implied by that user command. Since both the command and response handling parts of the program may need to write things to the user, I use a mutex around writing to standard output.

As an example, here is what happens when the user at the host enters the command to get the remote system firmware version info. The user types "FWINFO" and hits enter. The command processor parses FWINFO, looks up that keyword in a table, and branches to the routine to handle that command. That routine makes sure there isn't anything else on the command line (and emits an error if there is), and sends the single byte 2 to the remote system. The command processor is now done with that command, so it writes out a new line and prompt, and waits for the user to enter the next command.
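The keyword lookup on the host could be sketched roughly as follows. The function name `parse_command`, its return codes, and the table layout are assumptions for illustration. The sketch only covers commands without parameters and expects the input line with the trailing newline already removed.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Keyword table mapping user commands to the opcodes named in the text. */
struct cmd_entry {
    const char *name;
    uint8_t opcode;
};

static const struct cmd_entry cmd_names[] = {
    { "NULL",   0 },
    { "PING",   1 },
    { "FWINFO", 2 },
};

/* Parse one line typed by the user.
   Returns  1 and stores the opcode to send if the keyword is known,
            0 if the keyword is not in the table,
           -1 if a known keyword is followed by unexpected extra text. */
int parse_command(const char *line, uint8_t *opcode)
{
    while (*line == ' ') line++;             /* skip leading blanks */
    size_t n = strcspn(line, " ");           /* length of the keyword */
    const char *rest = line + n;
    while (*rest == ' ') rest++;             /* skip blanks after keyword */

    for (size_t i = 0; i < sizeof cmd_names / sizeof cmd_names[0]; i++) {
        if (strlen(cmd_names[i].name) == n &&
            strncmp(line, cmd_names[i].name, n) == 0) {
            if (*rest != '\0') return -1;    /* extra tokens on the line */
            *opcode = cmd_names[i].opcode;
            return 1;
        }
    }
    return 0;
}
```

For the line "FWINFO" this yields opcode 2, which is the only byte the host then needs to write to the serial port.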

The command processor in the embedded system receives the opcode byte 2. This causes execution to vector to the FWINFO command processing routine. That routine acquires the response stream mutex, and sends the bytes 2, type ID, version, and sequence. It then releases the mutex and returns to the main task loop that looks for the next command opcode.

Back on the host, the response-handling thread receives the 2 response opcode byte. This causes it to branch to the FWINFO response code. That code reads the next three bytes, and saves them as the firmware type ID, version, and sequence number. It then acquires the standard output mutex, writes "Firmware is type xx version yy sequence zz", releases the mutex, and returns to the main loop looking for the next response opcode byte.
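A rough sketch of the host-side decoding step is shown here. To keep it testable it works on a byte buffer and formats into a caller-supplied string, where a real response thread would read from the serial port and print under the standard output mutex. The function name `decode_response` and its return convention are assumptions.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

/* Response opcodes named in the text. */
enum { RSP_NULL = 0, RSP_PONG = 1, RSP_FWINFO = 2 };

/* Decode the single response that starts at buf[0].
   Writes the text to show the user into 'text' (empty for NULL) and
   returns the number of bytes consumed, or 0 if the response is not
   yet complete and more bytes are needed. */
size_t decode_response(const uint8_t *buf, size_t len,
                       char *text, size_t text_size)
{
    if (len == 0) return 0;

    switch (buf[0]) {
    case RSP_NULL:
        if (text_size > 0) text[0] = '\0';   /* nothing to show */
        return 1;
    case RSP_PONG:
        snprintf(text, text_size, "PONG");
        return 1;
    case RSP_FWINFO:
        if (len < 4) return 0;               /* wait for 3 data bytes */
        snprintf(text, text_size,
                 "Firmware is type %u version %u sequence %u",
                 (unsigned)buf[1], (unsigned)buf[2], (unsigned)buf[3]);
        return 4;
    default:
        snprintf(text, text_size, "Unknown response opcode %u",
                 (unsigned)buf[0]);
        return 1;
    }
}
```

All formatting and number-to-text conversion happens here on the host, so the micro never needs anything like printf for this exchange.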

From the user's point of view, he enters "FWINFO", and the program immediately responds with "Firmware is type xx version yy sequence zz".

Note several important things that result from this architecture:

  1. The number of bytes sent over the limited-bandwidth link is minimal. For example, only a single byte was sent to request the firmware version info, and only 4 bytes were sent in return.

  2. Parsing the commands is simple in the embedded system. It looks up each opcode in a table and jumps to the routine to handle the specific command. There is no string parsing, ASCII to binary conversion, etc.

  3. The data encoding is in the most convenient format for the embedded system. Values are sent as binary bytes, just like they are already in memory.

  4. Large library routines to convert between binary and ASCII, like printf, are not dragged into the build.

  5. Responses can be sent asynchronously. This can be useful for debugging, like you asked about. You can define some responses to be sent whenever some particular event occurs. These will be displayed by the host program on standard output as they are received.

I have sometimes created commands to enable or disable certain debug responses on the fly. You can even have a command that sets a flag so that the main event loop sends a particular response every 10 ms, for example. I've often used something like that to get all the measured analog values periodically. On the host side, these can be written to a CSV file, then displayed as a graph when the user enters a certain command.
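As an illustration of such a switchable periodic response, the sketch below sends one analog reading every 10 ms while a debug flag is set. Everything specific in it is an assumption: the flag `DBG_ADC`, the response opcode 3, the `adc_read()` stub with its fixed value, the low-byte-first ordering, and the idea that `telemetry_tick()` is called once per millisecond from the main event loop.

```c
#include <stdint.h>
#include <stddef.h>

#define DBG_ADC       0x01u   /* hypothetical flag: periodic ADC responses */
#define ADC_PERIOD_MS 10u     /* send interval in milliseconds */
enum { RSP_ADC = 3 };         /* hypothetical response opcode */

/* Debug enable flags; a hypothetical command handler would set or clear
   bits here at run time. */
uint8_t dbg_flags;

/* RAM buffer standing in for the UART transmit path (assumption). */
uint8_t tx_data[64];
size_t tx_len;

void uart_put_byte(uint8_t b)
{
    if (tx_len < sizeof tx_data) tx_data[tx_len++] = b;
}

/* Stub for the real ADC driver; returns a fixed made-up reading. */
uint16_t adc_read(void) { return 0x0123; }

/* Call once per millisecond. Every ADC_PERIOD_MS ticks, and only while
   DBG_ADC is enabled, send the opcode followed by the raw 16-bit value. */
void telemetry_tick(void)
{
    static uint8_t ms;

    if (++ms < ADC_PERIOD_MS) return;
    ms = 0;
    if (!(dbg_flags & DBG_ADC)) return;

    uint16_t v = adc_read();
    uart_put_byte(RSP_ADC);
    uart_put_byte((uint8_t)(v & 0xFFu));     /* low byte first */
    uart_put_byte((uint8_t)(v >> 8));        /* then high byte */
}
```

Each report costs three bytes on the link. The host can convert the raw value to engineering units and append it to a CSV file.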

There are many things that can be done. By starting with simple binary sequences in each direction that are asynchronous from each other, you can add all kinds of telemetry and debugging features, while still keeping the normal interactions with the system intact.