Electronic – FreeRTOS queues and IPC confusion

Tags: avr, c, embedded, freertos

I'm struggling with my first real FreeRTOS project. I am using an ATmega328P microcontroller and an nRF24L01+ radio as a "node". I have two of these nodes and I am using them to talk to each other. I have successfully ported FreeRTOS to my microcontroller, so I can use tasks to blink LEDs and to do very basic send/receive of data using my radio driver, which I also wrote. The part that I'm struggling with is how best to structure my code and tasks to create something like a network layer on top of my radio driver.

Here were my initial thoughts:

  1. When the radio's data-ready interrupt fires, I would set a global boolean variable to signal to my "radio_task()" function to read data from the radio's FIFO.
  2. I would have a function called "radio_task()", which would NOT actually be a FreeRTOS task. It would check the outgoing queue and use my radio driver (in combination with FreeRTOS receiveFromQueue()) to send the front packet on the queue if there is one. This function would also check the global boolean and read a packet from the radio's FIFO if there was one. It would then sendToQueue() the packet it read onto the incoming queue.
  3. I would then have an actual FreeRTOS task called nwk_task(), which would have a state machine to do various network maintenance and upkeep etc. Inside the state machine there would be some calls to a nwk_send() function, which would sendToQueue() for the outgoing queue. Likewise there would have to be some receiveFromQueue() calls on the incoming queue to process whatever packets I receive. Lastly, this FreeRTOS task would call my function radio_task().
  4. I would potentially have other tasks running which would monitor sensor data and perhaps also call nwk_send() when certain events occur based on sensor data etc.

I feel like that is not a terrible approach, but I'm running into a few difficulties that I thought I would ask about. Questions:

  1. I've created a typedef for my nwk_packet_t, which is a struct consisting of an address, a statically defined array sized for the largest possible radio packet, and a data_length which says how much of the array is actually used for this packet. How do I make a FreeRTOS queue which holds my nwk_packet_t's? Also, is there a better way to define nwk_packet_t than using the statically sized array? I thought about using a pointer, but then I have to make sure the array that I'm sending isn't overwritten until it's actually pulled off the queue, right?
  2. I'm confused about where blocking would/needs to occur. By using the global boolean flag in my data ready ISR I allow interrupts to occur more frequently, but does that mean I need to use portENTER_CRITICAL() when I actually push/pull data from the queues?
  3. How would a more experienced embedded programmer structure the tasks and queues to allow for a network layer on top of my basic radio driver?

Sorry for my severe lack of understanding on some of these topics, this is my first venture into OS level programming. Everything else I've done has been very low level and basic C. Thanks for your help!

EDIT: One thing I forgot to mention is that my radio has some hardware provisions for acking. You can set a message to be auto acknowledged with a certain number of retries and the hardware will handle all of that. However, a message can still fail which you can read from the radio or have the radio trigger an interrupt when this happens. So I would like to have some notion of a message success and a message failure that does some type of callback but I have no idea how to work that into my structure. Any ideas on that would fit in with removing stuff from the queues etc.?

Thanks again.

Best Answer

Comms stacks :(

Comms stacks in plain C :((

This is a summary of how I do it, though it's surely not the only way:

The app starts by creating a 'generalPool' array of buffer structs, (BS:), of a fixed size. No more buffers are ever allocated and no buffers are ever freed during the run. The BS has space for data, data len, next/prev index bytes and a 'command' enum that describes what the buffer is, (and other stuff, but that clouds the issue). Indexes into this array are used for all inter-thread and driver comms, (I use byte-size indexes, rather than pointers, because there are fewer than 256 BS and I have RAM constraints). The next/prev bytes are initialized to form a doubly-linked list, and the calls to get/put an index are protected by a mutex.
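To make the pool idea concrete, here is a minimal sketch of a fixed pool addressed by byte indexes. It is not my actual code: the names (`bs_t`, `pool_get`, `pool_put`, `BS_NONE`), the pool size and the payload size are all assumptions chosen for illustration, and the mutex is only marked by comments so that the snippet stays free of any OS headers.

```c
#include <assert.h>
#include <stdint.h>

#define POOL_SIZE   8u    /* assumed; must stay below 256 for byte indexes */
#define BS_DATA_MAX 32u   /* assumed payload size */
#define BS_NONE     0xFFu /* sentinel index meaning "no buffer" */

typedef enum { CMD_NONE, CMD_RX, CMD_TX_USED } bs_command_t; /* hypothetical */

typedef struct {
    uint8_t      data[BS_DATA_MAX];
    uint8_t      len;
    uint8_t      next;    /* index of next BS in whatever list holds this one */
    uint8_t      prev;    /* index of previous BS */
    bs_command_t command; /* tells the consumer what this buffer is */
} bs_t;

static bs_t    general_pool[POOL_SIZE];
static uint8_t pool_head  = BS_NONE;
static uint8_t pool_count = 0;

/* Link every BS into the free list once, at startup. */
void pool_init(void)
{
    for (uint8_t i = 0; i < POOL_SIZE; i++) {
        general_pool[i].next    = (i + 1u < POOL_SIZE) ? (uint8_t)(i + 1u) : BS_NONE;
        general_pool[i].prev    = (i > 0u) ? (uint8_t)(i - 1u) : BS_NONE;
        general_pool[i].len     = 0;
        general_pool[i].command = CMD_NONE;
    }
    pool_head  = 0;
    pool_count = POOL_SIZE;
}

/* Take one BS index from the pool; returns BS_NONE if the pool is empty. */
uint8_t pool_get(void)
{
    /* take the pool mutex here */
    uint8_t idx = pool_head;
    if (idx != BS_NONE) {
        pool_head = general_pool[idx].next;
        if (pool_head != BS_NONE) {
            general_pool[pool_head].prev = BS_NONE;
        }
        general_pool[idx].next = BS_NONE;
        general_pool[idx].prev = BS_NONE;
        pool_count--;
    }
    /* release the pool mutex here */
    return idx;
}

/* Return a BS index to the pool for re-use. */
void pool_put(uint8_t idx)
{
    assert(idx < POOL_SIZE);
    /* take the pool mutex here */
    general_pool[idx].len     = 0;
    general_pool[idx].command = CMD_NONE;
    general_pool[idx].prev    = BS_NONE;
    general_pool[idx].next    = pool_head;
    if (pool_head != BS_NONE) {
        general_pool[pool_head].prev = idx;
    }
    pool_head = idx;
    pool_count++;
    /* release the pool mutex here */
}

/* Number of BS currently in the pool; handy for spotting leaks. */
uint8_t pool_free_count(void)
{
    return pool_count;
}
```

A thread would call `pool_get()`, fill in `general_pool[idx]`, and pass only `idx` around. Because nothing is ever allocated after `pool_init()`, a falling `pool_free_count()` is a direct sign that some path forgot to call `pool_put()`.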

Inter-thread comms are performed by getting a BS index from the generalPool, loading it up as required, setting the enum and then pushing the index onto a producer-consumer queue. The thread at the other end dequeues the index and, typically, switches on the enum to handle the BS message. Once handled, the consumer thread can repool the BS or queue it on somewhere else for further handling, (logger, say).

Because the BS has those next/prev bytes, the producer-consumer queue class does not need any storage space of its own - it has first and last bytes and so can link together the BS in a similar manner to the pool.
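As a rough illustration of a queue that needs no storage of its own, the sketch below keeps only `first` and `last` bytes and threads the BS together through their own link bytes. The names (`bs_queue_t`, `bsq_push`, `bsq_pop`) are hypothetical, the BS is cut down to just its links, and the locking and wake-up signalling a real producer-consumer queue needs are left out.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BS  8u    /* assumed number of buffer structs */
#define BS_NONE 0xFFu /* sentinel index meaning "no buffer" */

/* Only the link bytes are shown; a real BS also carries data, length and command. */
typedef struct {
    uint8_t next;
    uint8_t prev;
} bs_links_t;

static bs_links_t bs[NUM_BS];

/* The queue itself is just two bytes. */
typedef struct {
    uint8_t first;
    uint8_t last;
} bs_queue_t;

void bsq_init(bs_queue_t *q)
{
    q->first = BS_NONE;
    q->last  = BS_NONE;
}

/* Append a BS index at the tail by rewriting the BS's own link bytes. */
void bsq_push(bs_queue_t *q, uint8_t idx)
{
    assert(idx < NUM_BS);
    bs[idx].next = BS_NONE;
    bs[idx].prev = q->last;
    if (q->last != BS_NONE) {
        bs[q->last].next = idx;
    } else {
        q->first = idx; /* queue was empty */
    }
    q->last = idx;
}

/* Remove and return the BS index at the head, or BS_NONE if empty. */
uint8_t bsq_pop(bs_queue_t *q)
{
    uint8_t idx = q->first;
    if (idx == BS_NONE) {
        return BS_NONE;
    }
    q->first = bs[idx].next;
    if (q->first != BS_NONE) {
        bs[q->first].prev = BS_NONE;
    } else {
        q->last = BS_NONE; /* queue is now empty */
    }
    bs[idx].next = BS_NONE;
    bs[idx].prev = BS_NONE;
    return idx;
}
```

The catch with this design is that a BS can only sit in one such queue (or the pool) at a time, since it has a single pair of link bytes.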

OK, now drivers:

I have interrupt-nesting disabled so that only one interrupt can run at a time. This enables me to make a BS index 'DriverQueue'. The DriverQueue has actual storage space for the index bytes - it does not use the next/prev links. This allows BS indexes to be safely added at one end, and removed at the other, by any one interrupt and one thread.
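One common way to get a queue that is safe between a single interrupt and a single thread is a small ring buffer in which each side writes only its own position byte. The sketch below is an assumption about how such a DriverQueue could look, not my actual implementation; `driver_queue_t`, `dq_push`, `dq_pop` and the slot count are made up for the example. It relies on a single-core MCU where byte reads and writes are atomic (as on the AVR), and `volatile` here only stops the compiler from caching or reordering the accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DQ_SLOTS 8u /* assumed; holds at most DQ_SLOTS - 1 indexes */

static_assert((DQ_SLOTS & (DQ_SLOTS - 1u)) == 0u, "DQ_SLOTS must be a power of two");

typedef struct {
    volatile uint8_t head;            /* written only by the producer */
    volatile uint8_t tail;            /* written only by the consumer */
    volatile uint8_t slots[DQ_SLOTS]; /* real storage for BS indexes */
} driver_queue_t;

/* Producer side: returns false if the queue is full. */
bool dq_push(driver_queue_t *q, uint8_t bs_index)
{
    uint8_t head = q->head;
    uint8_t next = (uint8_t)((head + 1u) & (DQ_SLOTS - 1u));
    if (next == q->tail) {
        return false;        /* full: one slot is always left unused */
    }
    q->slots[head] = bs_index;
    q->head = next;          /* publish only after the slot is written */
    return true;
}

/* Consumer side: returns false if the queue is empty. */
bool dq_pop(driver_queue_t *q, uint8_t *bs_index)
{
    uint8_t tail = q->tail;
    if (tail == q->head) {
        return false;        /* empty */
    }
    *bs_index = q->slots[tail];
    q->tail = (uint8_t)((tail + 1u) & (DQ_SLOTS - 1u));
    return true;
}
```

A zero-initialized `driver_queue_t` starts out empty. With interrupt nesting disabled, an rx interrupt can call `dq_pop()` on the CommsPool while the thread calls `dq_push()` on it, without either side needing to disable interrupts.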

I have one 'CommsPool' DriverQueue. This is pre-filled on startup with some BS extracted from the generalPool. These BS are used for received data.

I have one 'commsTx' DriverQueue for each tx interrupt. Outgoing data is queued on them.

I have one 'commsRx' DriverQueue for all rx interrupts. Incoming data is queued on it.

One 'commsThread' handles the higher-level comms by initializing and operating a state machine, similar to your idea. When idle, it waits on a 'CommsEvent' semaphore.

The rx interrupts get BS from the CommsPool, load them up with data from the hardware, set the command enum to 'RxX', (X is the comms channel/interrupt ID number), push the BS index onto the common commsRx queue and signal CommsEvent.

The tx interrupts get BS from their own, private commsTx, load the data into the hardware, set the command enum to 'TxUsed', push the BS index onto the common commsRx queue and signal CommsEvent.

The commsThread is responsible for managing all the I/O. It has a 'commsRq' input queue for comms request BS from other threads. This is not, however, a blocking queue - just thread-safe. It cannot block because the commsThread has to handle the CommsEvent signals from the interrupt-handlers as well.

Any thread that wants to communicate stuff loads up a BS with appropriate data and command, queues it to commsRq and signals CommsEvent, so waking the commsThread.

The commsThread does not know why it has been woken, so it polls the commsRx queue first to see if there is a BS in it. If there is, it handles it. If it is an 'RxX', the thread processes it through its state-engine code/data. If it is a 'TxUsed', the thread first checks whether the CommsPool needs 'topping up' and pushes the BS there if so, else it pushes it back onto the generalPool for re-use elsewhere.
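The routing decision for a BS taken off commsRx boils down to a switch on the command enum. The sketch below shows just that decision as a pure function; the enum values, `COMMS_POOL_TARGET` and `route_comms_rx_bs` are hypothetical names, and the per-channel 'RxX' commands are collapsed into a single `CMD_RX` to keep it short.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed fill level the CommsPool should be kept at. */
#define COMMS_POOL_TARGET 4u

static_assert(COMMS_POOL_TARGET <= 255u, "pool count must fit in a byte");

typedef enum { CMD_RX, CMD_TX_USED } bs_command_t;

typedef enum {
    DEST_STATE_MACHINE, /* received data: feed it to the comms state machine */
    DEST_COMMS_POOL,    /* spent tx buffer: recycle it for future rx */
    DEST_GENERAL_POOL   /* spent tx buffer: give it back for general use */
} bs_dest_t;

/* Decide where a BS dequeued from commsRx should go next. */
bs_dest_t route_comms_rx_bs(bs_command_t command, uint8_t comms_pool_count)
{
    switch (command) {
    case CMD_RX:
        return DEST_STATE_MACHINE;
    case CMD_TX_USED:
        return (comms_pool_count < COMMS_POOL_TARGET) ? DEST_COMMS_POOL
                                                      : DEST_GENERAL_POOL;
    }
    return DEST_GENERAL_POOL; /* unknown command: repool rather than leak */
}
```

Keeping the decision separate from the queue operations makes it easy to test on a PC before it ever runs on the target.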

Once the commsThread has handled the driver queues appropriately, it polls the commsRq queue to see if there are any new comms requests from other threads. If there are, it dequeues and handles the request through its state-machine code/data.

After that, the commsThread checks again to see if any CommsPool 'topping up' is required and, if the CommsPool is not full, tops it off with more BS from the generalPool.

The commsThread then loops back to wait on the semaphore again. The semaphore ensures that the commsThread runs exactly as many times as are required to handle all input from other threads and the interrupt-handlers, no more, no less. If the thread ever wakes up and finds nothing to do, it's an error.

That's how I do it, anyway :) It provides good throughput and efficient use of RAM. Inter-thread producer-consumer queues need no internal storage. Only one thread, (and so only one RAM-consuming stack:), is required for all interrupt-management and Tx/Rx data handling. No mallocs/frees are required after initialization. There is no busy-waiting or any need for periodic checking of any flags. No copying of the data is required, (except in/out of hardware - unavoidable). Timeout actions can be handled by either a timed wait on the semaphore, (preferable, if your OS supports it), or by the periodic 'injection' of a 'TimeTick' BS on the inputQueue from some other thread. Returned BS can easily be 'diverted' to, say, a logger or terminal, for debug display before returning them to the generalPool.

However you do this, you should consider moving to C++. C just gets messy for anything other than simple straight-line code. C++ allows, for instance, the BS to be implemented as class instances with methods for streaming in data and for 'auto-extending' a BS by getting and linking another BS if one BS gets full, so generating a 'compound' data message.

I've left some stuff out. For example, perhaps you already know the misery of tx interrupts - after the tx has been idle, they often have to be 'primed' by having the first bytes loaded into a FIFO to get the TX interrupt to start again :(

Also hint: my UART debug terminal prompt looks like 'A:96>'. The number, (96 here), is the current count of BS in the general pool. If this number starts dropping, I know I have a leak:)