Communication Between State Machines with Hidden Transitions

cembedded-systemsfailurefinite-state machinenetworking

The question emerged for me in embedded programming but I think it can be applied to quite a number of general networking situations e.g. when a communication partner fails.

Assume we have an application logic (a program) running on a computer and a gadget connected to that computer via e.g. a serial interface like RS232. The gadget has a red/green/blue LED and a button which disables the LED. The LEDs color can be driven by software commands over the serial interface and the state (red/green/blue/off) is read back and causes a reaction in the application logic. Asynchronous behaviour of the application logic with regard to the LED color down to a certain delay (depending on the execution cycle of the application) is tolerated.

What we essentially have is a resource (the LED) which can not be reserved and handled atomically by software because the (organic) user can at any time press the button to interfere/break the software attempt to switch the LED color.

Stripping this example from its physical outfit I dare to say that we have two communicating state machines A (application logic) and G (gadget) where G executes state changes unbeknownst to A (and also the other way round, but this is not significant in our example) and only A can be modified at a reasonable price. A needs to see the reaction and state of G in one piece of information which may be (slightly) outdated but not inconsistent with respect to the short time window when this information was generated on the side of G.

What I am looking for is a concise method to handle such a situation in embedded software (i.e. no layer/framework like CORBA etc. available). A programming technique which is able to map the complete behaviour of both participants on classical interfaces of a classical programming language (C in this case). To complicate matters (or rather, to generalize), a simple high frequency communication cycle of A to G and back (IOW: A is rapidly polling G) is out of focus because of technical restrictions (delay of serial com, A not always active, etc.).
What I currently see as a general solution is:

  • the application logic A as one thread of execution
  • an adapter object (proxy) PG (presenting G inside the computer), together with the serial driver as another thread
  • a communication object between the two (A and PG) which is transactionally safe to exchange

The two execution contexts (threads) on the computer may be multi-core or just interrupt driven or tasks in an RTOS.
The com object contains the following data:

  • suspected state (written by A): effectively a member of the power set of states in G (in our case: red, green, blue, off, red_or_green, red_or_blue, red_or_off…etc.)
  • command data (written by A): test_if_off, switch_to_red, switch_to_green, switch_to_blue
  • operation status (written by PG): operation_pending, success, wrong_state, link_broken
  • new state (written by PG): red, green, blue, off

The idea of the com object is that A writes whichever (set of) state it thinks G is in, together with a command. (Example: suspected state="red_or_green", command: "switch_to_blue") Notice that the commands issued by A will not work if the user has switched off the LED and A needs to know this. PG will pick up such a com object and try to send the command to G, receive its answer (or a timeout) and set the operation status and new state accordingly. A will take back the oject once it is no longer at operation_pending and can react to the outcome. The com object could be separated of course (into two objects, one for each direction) but I think it is convenient in nearly all instances to have the command close to the result.

I would like to have major flaws pointed out or hear an entirely different view on such a situation.

Best Answer

The hidden transitions and the latency make this a pretty interesting problem, especially since you can't modify the gadget. You may be looking for a synchronizing sequence or a homing sequence. ("Synchronization" means you can issue some sequence of commands, and be sure of the state of the target state machine when you're done. "Homing" means that you can issue some sequence of commands, then verify the state by examining the output of the state machine).

Sven Sandberg has a pretty good paper showing that (for any well-defined Mealy machine) you can always construct these two types of sequences. It's all theory, no code, so you'll have to do some implementation. Here's the citeseer reference to the paper itself, and a PDF slideshow of the ideas.

In the case you describe, latency is still an issue. I'm curious to know if you have some idea of the margin of error on these "hidden" state changes. How long between them? Are they more common, or more frequent, in some cases than others? Answers to these questions can help you define these control sequences.

Related Topic