UML Statecharts Formal Verification

Formal verification can be applied to UML statecharts. Aurelian Melinte demonstrates how to model statecharts in Promela.

How difficult is it to model an UML statechart in Promela [Spin] for formal verification? There seems to be very little information on how to do it.

Some notes about the Promela code

At its core, Promela is a modelling language, not a programming one. Even though it looks like C, the differences are significant. As such, you might find it lacks the constructs you would normally expect as a programmer and the goto, define and inline (with a very specific semantics) are ruling the field.

There are a few keywords that are not what you usually expect:

process: a process is Promela’s idea of modelling a parallel execution task. It is not an operating system process. A process executes a special proctype function. Processes exchange messages via channels (chan artefacts) and can reference global data.
state: Promela turns both the model you describe with the language, and the linear temporal language formula (LTL) describing the desired behaviour of the model in time, into Büchi automatons [Wikipedia] – these automatons’ states are the ones that Promela will report about. To avoid confusion with the states of the statechart, I will explicitly name these as UML-states. Plain ‘states’ are Promela/Büchi ones.
timeout: Promela has no notion of measurable time (though you can force one in). timeout is a variable that is set true when the verification has not yet run to completion yet the Büchi automaton does not know which state to transition to next and the verification has to stop. It could be a genuine timeout, a dead lock or something else, depending on what the model is about. Thus, you can check for timeout conditions in the model and act on accordingly.

It is important to note that statements are generally blocking statements: the process is stopped at a statement until the statement become executable. Some statements are unconditionally executable (think printf or assert and plain variable assignments) but most of them become executable only when meeting the appropriate condition (e.g. var == true becomes executable only when var is set to true; until this happens, the process trying to execute that statement is awaiting on the condition).

The unusual looking if construct is more akin to a C switch statement. It is a selection statement awaiting for statements to become executable. An :: option sequence starts executing when its first statement (the ‘guard’) becomes executable. And if more than one option becomes executable, the verification will explore all these paths ‘non-deterministically’. If none is executable, the selection blocks until at least one option becomes executable.

The curious end labels you see in the models are there to inform the verification engine that it is normal for that process to end the verification in that particular state. Otherwise, the verification will report it as an error.

Side note: do use the iSpin GUI coming with spin; or some other GUI. The insights offered into the model are really worth it.

Statechart modelling notes

There are (at least) two things to pay special attention to when modelling:

run to completion (RTC): an event entering the statechart should be fully processed before a next event is considered for processing. This is key, as having multiple events propagating quasi-simultaneously through the statechart will end up with the wrong events processed by the wrong UML-states. The side effects of an event propagating through will start being visible as the event processing is progressing and but this is good: the statechart, supposedly, lives in a multi-threaded environment and you want to check for race conditions that the actions on enter/exit/transitions are asynchronously effecting changes on shared resources.
transition execution order: the canonical order when transitioning from UML-state A to UML-state B is: execute on-exit actions of A then execute the actions associated with the transition itself then execute the on-entry actions of B. Some statechart implementations [QP] might have a different execution order (e.g. transition then on-exit A then on-exit B). You may want to model for the specifics of the statechart engine that the model will be implemented in.

A finite state machine (FSM)

Modelling a plain FSM should be a rather straight-forward task: one process for the whole statechart should suffice. Figure 1 is one easy FSM to model: a double-switch light: the light is on only when both a wall switch and an on-lamp switch are turned on. Listing 1 shows the relevant parts of the model [Melinte-1]. goto aside, it is quite readable and a rather direct translation of the UML chart.

Figure 1

#define idx_unknown -1
#define idx_state_BothOff 0
#define idx_state_LampOff 1
#define idx_state_LightOn 2
#define idx_state_WallOff 3

mtype = { event_LampSwitch, event_WallSwitch }

typedef event {mtype evId};
chan _stateMachineChannel = [1] of {event};
bool _stateMachineReady = false;
bool _isLightOn = false;

inline send_event(evt)
{
  local event evtSend;
  evtSend.evId = evt;
  _stateMachineChannel!evtSend;
}

proctype Switch()
{
  xr _stateMachineChannel; 
  local event evtRecv; 
  local short currentState = idx_unknown; 

  /* initial state idx_state_BothOff[*/
entry_BothOff: //0
  /* execute on-entry BothOff actions if any */
  currentState = idx_state_BothOff;
  _stateMachineReady = true;
end_Switch:
body_BothOff:
  if
  :: (evtRecv.evId == event_LampSwitch) -> 
    /* execute on exit state BothOff then
       transition actions*/
    goto entry_WallOff;
  :: (evtRecv.evId == event_WallSwitch) -> 
    /* on exit BothOff & 
       transition actions here */
    goto entry_LampOff;
  fi
/*]state idx_state_BothOff*/

/* state idx_state_LampOff[*/
...
/*]state idx_state_LampOff*/

/* state idx_state_LightOn[*/
entry_LightOn: //2
  _isLightOn = true;
  currentState = idx_state_LightOn;

body_LightOn:
  if
  :: (evtRecv.evId == event_LampSwitch) -> 
    _isLightOn = false;
    goto entry_LampOff;
  :: (evtRecv.evId == event_WallSwitch) -> 
    _isLightOn = false;
    goto entry_WallOff;
  fi
/*]state idx_state_LightOn*/

/* state idx_state_WallOff[*/
...
/*]state idx_state_WallOff*/
} // Switch

Listing 1

The statechart gets its events from a model-global _stateMachineChannel channel. This channel needs only a one-message capacity as the statechart processes events one at a time. There is a _stateMachineReady flag to let the events flow into the statechart once it has reached the proper initial state. And there is an _isLightOn ghost variable cum-lightbulb for verification purposes.

To verify it, Promela needs a ‘closed environment’: that is, we need to add code to exercise the statechart – a family of test scenarios. We also need to tell Promela what the expected behavior is for the given test scenarios: an LTL formula to verify. These are in Listing 2.

proctype TestEnvironment() provided
  (_stateMachineReady)
{
  assert(Switch:currentState ==
    idx_state_BothOff);
  if
  :: true -> 
    send_event(event_WallSwitch);
    send_event(event_LampSwitch);
  :: true -> 
    send_event(event_LampSwitch);
    send_event(event_WallSwitch);
  fi
  (_isLightOn == true);
  assert(Switch:currentState ==
    idx_state_LightOn);

  if
  :: true -> 
     send_event(event_LampSwitch);
     send_event(event_WallSwitch);
  :: true -> 
     send_event(event_WallSwitch);
     send_event(event_LampSwitch);
  fi
  (_isLightOn == false);
  (Switch:currentState == idx_state_BothOff);
} // TestEnvironment

ltl { always/*[]*/ eventually/*<>*/
    (             (Switch:currentState == 
      idx_state_BothOff && _isLightOn == false) 
    implies/*->*/ (Switch:currentState == 
      idx_state_LightOn && _isLightOn == true) 
    implies/*->*/ (Switch:currentState == 
      idx_state_BothOff && _isLightOn == false)
)}

Listing 2

The verification is only as useful as the closed environment is elaborate enough to match the reality that the statechart will be subjected to in its real implementation. And only as useful as the LTL formula expresses the expected behavior. One other process will be environment injecting the events into the statechart. The TestEnvironment flips one switch then another to get the light on then flips both again to get back to the initial state. It should start injecting events only when the provided clause holds true, which happens when the statechart reached its initial state.

Hence TestEnvironment offers four possible execution paths – four scenarios – to verify. In all four cases, the LTL should hold true. The LTL can be expressed with plain-English operators or more cryptic ones – both possibilities are shown here.

A hierarchical state machine

Now onto the main dish: how to model HSMs. There are at least two approaches.

Flattening

One approach could be flattening. Theoretically, any HSM can be ‘flattened’ into an equivalent FSM by hoisting the transitions table from a substate into the composite states owning it. Start with the lowest leaf states – flattening will eliminate these much like all four UML-states are expressed into the one FSM Switch above. Repeat until you reach the top. If you have to do it manually (I know of no tool to do it for Promela), I doubt it can work with sizeable statecharts, given the resulting goto ping-pong code. Good luck keeping the Promela model true to its UML spec.

A process per UML-state

Another approach is to use one process per UML-state and a maze of channels to move events between these. Despite the growing complexity of the model, the translation of the UML spec is rather mechanical, and it keeps the transition tables local to each process-cum-UML-state. It can scale, at least as a mental effort. We can think of the double-switch as if being an HSM and model it accordingly: the switch itself can be looked at as a composite UML-state with four substates. See the relevant snippets of this model [Melinte-2] in Listing 3 (supporting code), Listing 4 (a sample state implementation out of four) and Listing 5 (switch HSM) for the model; and Listing 6 for the closed environment.

mtype = { event_ExitState, event_EnterState,
          event_LampSwitch, event_WallSwitch, }
typedef event {mtype evId; short toState};

chan  _stateMachineChannel = [1] of {event};
chan  _stateMachineInternalChannel = [3] of 
       {event};
short _currentState = idx_unknown;
bool  _stateMachineReady = false;
bool  _isLightOn = false;

inline send_internal_event(evt, toState) 
  // sorted order
{
  _stateMachineInternalChannel!evt(toState);
}

inline send_event(evt, toState)
{
  empty(_stateMachineInternalChannel);
  _stateMachineChannel!evt(toState);
}

Listing 3

proctype StateLightOn(chan superChannel; 
                      chan eventProcessedChan)
{
  local event evtRecv; 

entry_LightOn: 
  _currentState = idx_state_LightOn; //2
  _isLightOn = true;

body_LightOn:
  superChannel?evtRecv; 
  // Send event to substates/regions for 
  // processing (none here). If substates did not 
  // processed the event: attempt to process the 
  // event per our transition table below

  atomic {
    if
      :: (evtRecv.evId == event_ExitState && 
         evtRecv.toState == idx_state_LightOn) -> 
      eventProcessedChan!true;
      goto exit_LightOn;

      :: (evtRecv.evId == event_LampSwitch) -> 
      send_internal_event(event_EnterState, 
                          idx_state_LampOff);
      eventProcessedChan!true;
      goto exit_LightOn;

      :: (evtRecv.evId == event_WallSwitch) -> 
      send_internal_event(event_EnterState, 
                          idx_state_WallOff);
      eventProcessedChan!true;
      goto exit_LightOn;

      :: else -> assert(false); 
      skip;
    fi

    goto body_LightOn;
  } // atomic

exit_LightOn:
  _isLightOn = false;
}

Listing 4

proctype Switch(chan superChannel)
{
  local event evtRecv; 
  chan substateChannel = [1] of {event};
    
  chan eventProcessedChannel = [0] of {bool}
  bool eventProcessed = false;

entry_Switch:
  send_internal_event(event_EnterState, 
    idx_state_BothOff); // enter initial state

body_Switch:
  if
  :: nempty(_stateMachineInternalChannel) -> 
     _stateMachineInternalChannel?evtRecv
  :: empty(_stateMachineInternalChannel)  -> 
     end_Switch: superChannel?evtRecv;
  fi
      
  atomic {
    if
      :: (evtRecv.evId == event_EnterState &&
         evtRecv.toState == idx_state_BothOff) -> 
      run StateBothOff(substateChannel,
                       eventProcessedChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState &&
         evtRecv.toState == idx_state_LampOff) -> 
      run StateLampOff(substateChannel,
                       eventProcessedChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState &&
         evtRecv.toState == idx_state_LightOn) -> 
      run StateLightOn(substateChannel,
                       eventProcessedChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState &&
         evtRecv.toState == idx_state_WallOff) -> 
      run StateWallOff(substateChannel,
                       eventProcessedChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_ExitState) -> 
      substateChannel!evtRecv;
      eventProcessedChannel?eventProcessed;
      goto body_Switch;

      :: else -> skip; // send to substates 
                       // for processing
      fi
      // Send event to substates/regions for
      // processing
      substateChannel!evtRecv.evId(_currentState);
      eventProcessedChannel?eventProcessed;
      if
      :: (eventProcessed == true) -> 
         goto body_Switch;
      :: else -> skip; // to the transition table 
                       // next
      fi
      // Attempt to process the event per our
      // transition table which is empty
      assert(false);
      goto body_Switch; // next event?
    } // atomic
exit_Switch:
} // Switch

Listing 5

proctype TestEnvironment() 
{
  (_stateMachineReady == true);
  assert(_currentState == idx_state_BothOff);
  
  // unchanged code
  ...
  
  (_isLightOn == true);
  assert(_currentState == idx_state_LightOn);
  
  // unchanged code
  ...

  (_isLightOn == false);
  (_currentState == idx_state_BothOff);
} // TestEnvironment

ltl {[] <>
  (  (_currentState == 
      idx_state_BothOff && _isLightOn == false) 
  -> (_currentState == 
      idx_state_LightOn && _isLightOn == true) 
  -> (_currentState == 
      idx_state_BothOff && _isLightOn == false)
)}

Listing 6

A substate awaits events from its superstate via a dedicated superChannel. If the event is about entering one of its own substates, it spawns the appropriate process. If not, it first passes the event down to its own currently executing sub-substate and awaits the result of. The sub-substate reports back whether it processed or passed on the event via an eventProcessedChannel rendez-vous (a zero-messages capacity) channel. If the event was passed on, the substate checks its own transition table and acts on the event or passes it accordingly. Finally, it informs its superstate of the processing result in the same vein it was informed of the processing status by the sub-substate.

There are a few changes to take notice of:

The atomic wrapping a bunch of statements are there to reduce the number of verification states (it works: for this particular model the reduction is about two thirds less Büchi states)
There is now a _stateMachineInternalChannel This is needed to accommodate the RTC requirement: this channel will accumulate event_EnterState and event_ExitState as the HSM moves from one UML-state A to UML-state B. Manually determine the least common ancestor (LCA) UML-state of A and B and queue exit events then enter events into the internal channel as you traverse from A to B. The state machine will first process internal events (if any) then await for an external event. The capacity of the stateMachineInternalChannel is twice of the longest LCA path plus one for the transition itself.
The zero-sized eventProcessedChannel channel that is local to each UML-state that is a composite state is a rendez-vous channel for the substates to report back to their superstate how an event that reached into that substate was processed (or passed on). The superstate will block on it until the substate is done processing, thus ensuring RTC.
_currentState is now a global. TestEnvironment and the LTL have been changed accordingly. We avoid remote references to processes’ internal variables for reasons that will get explained below.

And here is a slightly modified test scenario and LTL formula in Listing 6. Of note, the provided clause has been replaced with a plain statement: (_stateMachineReady == true);. The change is explained below.

As stated above, TestEnvironment and the LTL are now referencing the global _currentState.

Complexity and runtime notes

The FSM model has less than 300 states and the verification is done in a few jiffies. The HSM has less than 4000 Büchi states and the verification runs its course still as fast. The models are too small to infer anything significant but, while the statechart model is rather fixed in size, the environment counterpart can be made really complex. For instance we can add a loop to inject 10 times or 100 times more events in the statechart:

HSM Model Complexity	Performance
Base case	4000 states; runtime less than 0.01 second
10x	52k states in 0.03 seconds
100x	270k in 0.13 seconds

On my low-end machine, the verification can churn through 4.5 million states in two minutes. That seems like a huge margin allowing for very complex models to be verified in a reasonable time.

Unless the model needs fairness. Fairness requires that ready-to-run processes will not be starved: eventually such processes will get their turn. This model does not need fairness to be functional but it might happen you need it if the statechart does more than flip on/off a lightbulb boolean. Nevertheless, let’s turn fairness on and have a panic moment:

HSM Model Complexity	Performance
Base case	625k states; runtime 0.3 sec
10x	4.7 million states in 50 seconds
100x	Verification might not complete in this lifetime

That makes an enormous difference. There still is a chance that you will be able to hold your breath without passing out while the verification completes: to let the verification use its partial-order reduction (POR) algorithm. POR is active by default but it can be disabled by some artefacts used in the model:

the provided clause. This is a too-strong synchronization mechanism as it can suspend a process when the clause turns false. A simple statement with its built-in execution semantics is enough here to let events flow from TestEnvironment.
remote variable references: _currentState is now a global that LTL can use instead of reaching directly into processes’ internal variables. It is ugly to expose internals but ugly will save the day.
other constructs not discussed here such as _last, enabled.
finally, the rendez-vous channel mechanism had to be replaced. Again, the synchronization offered by rendez-vous is too strong for what we need (it changes states in two processes in one verification step). Listing 7 and Listing 8 are code snippets of the new model [Melinte-3] that replaces that channel with an _eventProcessed global. More internal details exposed.

short _eventProcessed = idx_unknown;
proctype StateBothOff(chan superChannel)
{
  local event evtRecv;

entry_BothOff:
  _currentState = idx_state_BothOff; //0
  _stateMachineReady = true;

body_BothOff:
end_BothOff: // valid verification's end
  superChannel?evtRecv; 

  //As before

  atomic {
    if
      :: (evtRecv.evId == event_ExitState && 
        evtRecv.toState == idx_state_BothOff) -> 
      _eventProcessed = idx_processed_Processed;
      goto exit_BothOff;
      
      :: (evtRecv.evId == event_LampSwitch) -> 
      /* execute transition actions then on exit
         state BothOff; not UML-compliant */
      send_internal_event(event_EnterState,
                          idx_state_WallOff);
      _eventProcessed = idx_processed_Processed;
      goto exit_BothOff;

      :: (evtRecv.evId == event_WallSwitch) -> 
      send_internal_event(event_EnterState,
        idx_state_LampOff);
      _eventProcessed = idx_processed_Processed;
      goto exit_BothOff;
      
      :: else -> assert(false);
      _eventProcessed = 
        idx_processed_NotProcessed;
      skip;
    fi

    goto body_BothOff;
    } // atomic
exit_BothOff:
    /* execute on exit BothOff actions */
}

Listing 7

proctype Switch(chan superChannel)
{
  local event evtRecv; 
  chan substateChannel = [1] of {event};
    
entry_Switch:
  send_internal_event(event_EnterState, 
    idx_state_BothOff); // initial state

body_Switch:
  if
  :: nempty(_stateMachineInternalChannel) -> 
    _stateMachineInternalChannel?evtRecv
  :: empty(_stateMachineInternalChannel)  -> 
    end_Switch: superChannel?evtRecv;
  fi
      
  atomic {
    _eventProcessed = idx_unknown;
    if
      :: (evtRecv.evId == event_EnterState && 
        evtRecv.toState == idx_state_BothOff) -> 
      run StateBothOff(substateChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState && 
        evtRecv.toState == idx_state_LampOff) -> 
      run StateLampOff(substateChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState && 
        evtRecv.toState == idx_state_LightOn) -> 
      run StateLightOn(substateChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_EnterState && 
        evtRecv.toState == idx_state_WallOff) -> 
      run StateWallOff(substateChannel);
      goto body_Switch;

      :: (evtRecv.evId == event_ExitState) -> 
      substateChannel!evtRecv;
      goto body_Switch;

      :: else -> skip; // send to substates for
                       // processing
      // goto body_Switch;
    fi
    
    // send event to substates/regions for
    // processing
    _eventProcessed = idx_unknown;
    substateChannel!evtRecv.evId(_currentState);
    (_eventProcessed != idx_unknown);
    if
    :: (_eventProcessed == idx_processed_Processed) -> goto body_Switch;
    :: else -> skip; // to the transition table
                     // next
    fi
    
    // Attempt to process the event per our 
    // transition table which is empty
    assert(false);
    
    goto body_Switch;
  } // atomic

exit_Switch: 
} // Switch

Listing 8

And look at the difference in performance with POR:

HSM Model Complexity	Performance
10x	130k states in 0.1 sec
100x	655k states in 0.13 sec

More modelling

What about orthogonal regions? Regions can be modelled as processes akin to the other UML-state processes.

History states: one way to model these is to add a _deferredEventsInternalChannel channel to the model for transitions in/out of the history states. Events accumulated in this channel should be processed first, before the _stateMachineInternalChannel and the _stateMachineChannel channel.

Choice and junctions: see [Damjan17] for ideas.

TLA+

The equivalent FSM model has identical logic flow barring the differences in syntax.

An equivalent HSM model will need significant adjustments because TLA cannot dynamically ‘spawn’ processes-cum-UML-states. These will have to be all created at-start and de/activated by events. But more importantly: there is no algorithm in TLA that I know of that is equivalent to POR and the verification engine is Java code: slower by an order of magnitude or two from the get-go. Complexity could kill it.

References

[Damjan17] Panisara Damjan and Wiwat Vatanawood ‘Translating UML State Machine Diagram into Promela’ in Proceedings of the International MultiConference of Engineers and Computer Scientists 2017 Vol I, IMECS 2017, March 15 - 17, 2017, Hong Kong, available at: https://www.iaeng.org/publication/IMECS2017/IMECS2017_pp512-516.pdf

[Melinte-1] Promela SM Models – switch.promela: https://github.com/melintea/upml/blob/main/doc/promela-sm-models/switch.promela

[Melinte-2] Promela SM Models – switch.hsm.promela: https://github.com/melintea/upml/blob/main/doc/promela-sm-models/switch.hsm.promela

[Melinte-3] Promela SM Models – switch.hsm.limites.promela: https://github.com/melintea/upml/blob/main/doc/promela-sm-models/switch.hsm.limits.promela

[QP] Quantum Products: https://www.state-machine.com/products

[Spin] ‘Verifying Multi-threaded Software with Spin’: https://spinroot.com/spin/whatispin.html

[Wikipedia] ‘Büchi automaton’: https://en.wikipedia.org/wiki/B%C3%BCchi_automaton

Aurelian Melinte Aurelian acquired his programming addiction in late 90s as a freshly-minted hardware engineer and is not looking for a cure. He spends most of his spare time reading and exercising.