Early error detection for real time applications

The focus of this work is error detection in real-time applications and our objective is to decide as early as possible when an application won’t obey its real-time constraints. Our approach is based on runtime verification. 

Runtime verification of formal specifications provides the means to generate error detectors with detection capabilities depending mostly on the kind of formalism considered. The stronger the formalism is the easier separation between correct and erroneous execution is. Nevertheless, two recurring issues have to be considered before using such error detection mechanisms. First, the cost, at run-time, of such error detector has to be assessed. Then, we have to ensure that the execution of such detectors has a well defined semantics.

This work aims at better understanding the conditions of use of such detectors within critical real-time software application. Given formal behavioural specification, we defined the notion of "behavioural error". Then, we identify the class of early detectors that optimize the detection latency between the occurrence of such errors and their signalling. The whole generation process has been implemented for specifications provided as timed automata.

The prototype achieves early error detection thanks to a pre-processing of the automaton to generate its temporal abstraction. Our contributions are three fold : formalisation of early detection, algorithms for timed automata run-time verification, and prototyping of such detectors on a real-time kernel, Xenomai.

This EED (Early Error Detection) approach can be applied to conventional pattern of fault tolerance software (for instance Recovery Block) and in fact used to handle degraded modes of operation (implemented as a sequence of blocks) in real-time applications.  In the current state of the work, each block has independent specification and thus timing constraints. It would be of high interest to use hierarchical timed automata, modelling the whole design pattern behaviour. This means that the whole set of possible blocks is part of a single model in which global timing constraints can be defined. In this model, one can easily identify the modelling of each block, the transition between each block models being the EED special event. The benefit here is that the block to be activated next can be selected according to the remaining time, something that was not possible in the independent block modelling approach.

The next step of the work is to address this problem for distributed real-time systems.