Fault-tolerant architectures for reactive critical systems

Recent advances in digital networks and in smart sensors and actuators offer new opportunities to develop innovative fault-tolerant architectures for critical control systems that must satisfy stringent dependability, performance and cost-reduction requirements. The main issue is to take advantage of the increasing computing capabilities at the sensor and actuator levels to achieve an optimal distribution of system functionalities and fault-tolerance mechanisms among computers and actuators.

We have investigated this problem in the context of Flight Control Systems (FCS). We proposed distributed and reconfigurable architectures that break with the traditional centralized-federated COM/MON architectures, in which dedicated fault-tolerant computers perform all processing. The new alternative FCS architectures ([1] and [2]) rely on simplex computers and on the distribution of system functionalities between computers and actuators; they use fewer hardware and software resources while meeting the same (or even better) level of safety and availability requirements as the prior art. Two architectures have been investigated [3][4]: the “Massive Voting” architecture (see Figure 1) and the “Priority Voting” architecture. They were validated through OCAS/AltaRica (for safety requirements) and Matlab/Simulink simulations (for robustness requirements). This work was carried out with Airbus-France.
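The voting logic of these architectures is not detailed here. Purely as an illustration of what actuator-side voting over commands from several simplex computers could look like, the following sketch applies a median vote and flags outliers. The function name `vote`, the computer labels and the `tolerance` threshold are hypothetical and are not taken from [1]-[4].

```python
from statistics import median


def vote(commands, tolerance=0.5):
    """Median-vote over the commands received from several simplex computers.

    `commands` maps a (hypothetical) computer name to its commanded value.
    Returns the voted value and the sorted names of computers whose command
    deviates from it by more than `tolerance` (an assumed threshold).
    """
    voted = median(commands.values())
    suspects = sorted(
        name for name, value in commands.items()
        if abs(value - voted) > tolerance
    )
    return voted, suspects


# Three computers, one of which sends an erroneous command.
voted, suspects = vote({"c1": 10.0, "c2": 10.1, "c3": 42.0})
print(voted, suspects)
```

A median is used here only because it masks a minority of arbitrary values without needing to know which source is faulty; the actual architectures may rely on a different rule.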

We recently resumed work ([7] and [8]) on integrity in communication networks, building on our previous study [5][6]. Our goal is to deepen and extend an innovative protection method at the application level. The method deals with multiple permanent errors in critical systems characterized by “slow dynamics” (e.g., Flight Control Systems). Slow dynamics make it possible to address the integrity of a set of messages rather than of every single message: distinct error detection functions with “complementary” error detection capabilities are applied to consecutive messages (see Figure 2). Several challenges arise: i) proving the complementarity of the selected codes (e.g., Fast CRC, Adler and Fletcher), both for codes belonging to the same family and for codes belonging to different families; ii) increasing error detection capabilities with very limited redundancy and resource use: the systems considered are critical and therefore require a high level of integrity, but being embedded they have limited resources and short messages, so this trade-off must be taken into account; iii) validating the proposed approach analytically and/or by simulation; iv) finding applications in other fields with the same requirements and features as FCS (e.g., automotive applications).
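The idea of applying distinct detection functions to consecutive messages can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of [5]-[8]: the round-robin rotation, the helper names `protect` and `verify`, and the use of `zlib.crc32` as a stand-in for the "Fast CRC" mentioned above are all hypothetical choices.

```python
import zlib


def fletcher16(data: bytes) -> int:
    """Fletcher-16 checksum: two running sums, each taken modulo 255."""
    s1 = s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


# Assumed rotation of detection functions: message k is protected by
# function k mod 3. zlib.crc32 stands in for the CRC of the actual scheme.
CHECKS = [
    lambda d: zlib.crc32(d) & 0xFFFFFFFF,
    lambda d: zlib.adler32(d) & 0xFFFFFFFF,
    fletcher16,
]


def protect(messages):
    """Attach to the k-th payload the check value of the k-th function."""
    return [
        (payload, CHECKS[k % len(CHECKS)](payload))
        for k, payload in enumerate(messages)
    ]


def verify(frames):
    """Recompute each check on the receiver side; one verdict per frame."""
    return [
        CHECKS[k % len(CHECKS)](payload) == check
        for k, (payload, check) in enumerate(frames)
    ]


# Slow dynamics: the same command is repeated over consecutive messages,
# so integrity can be judged on the set of frames rather than on each one.
frames = protect([b"cmd=1.5", b"cmd=1.5", b"cmd=1.5"])
set_is_intact = all(verify(frames))
```

With such a rotation, a permanent error pattern that happens to be undetectable by one function still has to escape the other functions applied to the following messages, which is the intuition behind requiring complementary detection capabilities.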

Figure 1: Massive Voting Architecture

Figure 2: Cumulative Error Detection Approach


