Fault-tolerant architectures for reactive critical systems

Recent advances in digital networks and in smart sensors and actuators offer new opportunities for developing innovative fault-tolerant architectures for critical control systems that must satisfy stringent dependability, performance and cost-reduction requirements. The main issue is to take advantage of the increasing computing capabilities at the sensor and actuator levels to achieve an optimal distribution of system functionalities and fault-tolerance mechanisms among computers and actuators.

We have investigated this problem in the context of Flight Control Systems (FCS). We proposed distributed and reconfigurable architectures that break with the traditional centralized-federated COM/MON architectures, in which dedicated fault-tolerant computers perform all processing. The new alternative FCS architectures ([1] and [2]) are based on simplex computers and on the distribution of system functionalities between computers and actuators; they use fewer hardware and software resources while meeting the same (or even better) level of safety and availability requirements as the prior art. Two architectures have been investigated [3][4]: the “Massive Voting” architecture (see Figure 1) and the “Priority Voting” architecture. They were validated using OCAS/AltaRica (for safety requirements) and Matlab/Simulink simulations (for robustness requirements). This work was carried out with Airbus-France.
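The text above does not describe how voting is actually performed in these architectures. Purely as a generic illustration of what voting at the actuator level over commands from several simplex computers can look like, here is a minimal Python sketch of a median voter. The function name `vote`, the `tolerance` parameter and the use of a median are assumptions made for this example, not the method of [1]-[4].

```python
from statistics import median


def vote(commands, tolerance=0.5):
    """Generic median voter (hypothetical, not the voting scheme of the cited work).

    commands: one numeric command per simplex computer, or None when a
              computer delivered nothing.
    Returns the voted command and the indices of computers that were silent
    or deviated from the voted value by more than `tolerance`.
    """
    valid = [c for c in commands if c is not None]
    if not valid:
        raise ValueError("no command received from any computer")
    voted = median(valid)
    suspects = [
        i for i, c in enumerate(commands)
        if c is None or abs(c - voted) > tolerance
    ]
    return voted, suspects


if __name__ == "__main__":
    # Computer 3 sends an erroneous command; the median masks it.
    print(vote([10.0, 10.1, 9.9, 42.0, 10.0]))
```

A median is used here only because it masks a minority of arbitrary values without needing exact agreement between computers; a real actuator-side voter would also have to handle timing and reconfiguration, which this sketch ignores.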

We recently resumed work ([7] and [8]) on integrity in communication networks, building on our previous study [5][6]. Our goal is to deepen and extend an innovative protection method at the application level. The method deals with multiple permanent errors in critical systems characterized by “slow dynamics” (e.g., Flight Control Systems). Slow dynamics make it possible to address the integrity of a set of messages rather than of every single message. Distinct error detection functions featuring “complementary” error detection capabilities are applied to consecutive messages (see Figure 2). Several challenges arise: i) proving the complementarity of the selected codes (e.g., Fast CRC, Adler and Fletcher), both for codes belonging to the same family and for codes belonging to different families; ii) increasing error detection capabilities with very limited redundancy and resource usage: the systems considered are critical and therefore require a high level of integrity, but because they are embedded they have limited resources and short messages, so this trade-off must be taken into account; iii) validating the proposed approach analytically and/or by simulation; iv) finding applications in other fields that share the requirements and features of FCS (e.g., automotive applications).
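As a rough illustration of applying distinct error detection functions to consecutive messages, here is a minimal Python sketch, not the authors' implementation. The names `protect` and `verify`, the round-robin rotation order, and the use of the full-width `zlib.crc32` and `zlib.adler32` in place of the short codes an embedded system would use are all assumptions made for this example.

```python
import zlib


def fletcher16(data: bytes) -> int:
    """Fletcher-16 checksum (two running sums modulo 255)."""
    s1 = s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


# Hypothetical rotation of check functions: message i uses CHECKS[i % 3].
CHECKS = [zlib.crc32, zlib.adler32, fletcher16]


def protect(messages):
    """Attach to each message the check value of the function for its position."""
    return [(m, CHECKS[i % len(CHECKS)](m)) for i, m in enumerate(messages)]


def verify(frames):
    """Return one boolean per frame: True if its check value still matches."""
    return [CHECKS[i % len(CHECKS)](m) == c for i, (m, c) in enumerate(frames)]


if __name__ == "__main__":
    frames = protect([b"cmd-01", b"cmd-02", b"cmd-03", b"cmd-04"])
    print(verify(frames))

    # Model a permanent error: the same bit is flipped in every message.
    corrupted = [(bytes([m[0] ^ 0x01]) + m[1:], c) for m, c in frames]
    print(verify(corrupted))
```

The point of the rotation is that an error pattern missed by one function on one message may still be caught by a different function on a following message. Whether the chosen functions are actually complementary in this sense is precisely the open question listed under i) and is not established by this sketch.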

Figure 1: Massive Voting Architecture


Figure 2: Cumulative Error Detection Approach

Publications

[1] M. Sghairi, "Architectures innovantes de systèmes de commandes de vol", PhD thesis, INP Toulouse, 27 May 2010, http://tel.archives-ouvertes.fr/tel-00509156/fr/

[2] M. Sghairi, A. de Bonneval, Y. Crouzet, J-J. Aubert, P. Brot, Y. Laarouchi, "Distributed and Reconfigurable Architecture for Flight Control System", 28th Digital Avionics Systems Conference (DASC'09), Orlando (USA), 25-29 Oct. 2009, Publisher: IEEE Computer Society, ISBN 978-1-4244-4078-8, DOI 10.1109/DASC.2009.5347447, pp. 6.B.2-01 to 6.B.2-10 

[3] M. Sghairi, P. Brot, J-J. Aubert, A. de Bonneval, Y. Crouzet, "Système de commande de vol et aéronef le comportant", LAAS Report N° 09070, 13p., joint AIRBUS France and CNRS patent. Publication numbers and dates: INPI filing: FR20090050831 - 10/02/2009; France: FR2941913 (A1 and B1) - 13/08/2010 and 31/08/2012 (patent granted); Europe: EP2216245 (A1) - 11/08/2010; USA: US2010222943 (A1) - 02/09/2010, then US8761969 (B2) - 24/06/2014 (patent granted)

[4] M. Sghairi, P. Brot, J-J. Aubert, A. de Bonneval, Y. Crouzet, "Système de commande de vol et aéronef le comportant", LAAS Report N° 09069, 13p., joint AIRBUS France and CNRS patent. Publication numbers and dates: INPI filing: FR20090050830 - 10/02/2009; France: FR2941912 (A1 and B1) - 13/08/2010 and 18/02/2011 (patent granted); Europe: EP2216244 (A1 and B1) - 11/08/2010 and 01/02/2012 (patent granted); USA: US2010204853 (A1) - 12/08/2010

[5] A. Youssef, A. de Bonneval, Y. Crouzet, J-J. Aubert, P. Brot, "Détection d’erreurs dans les données concernant l’actionnement d’un organe de véhicule", joint AIRBUS France and CNRS patent. Publication numbers and dates: INPI filing: FR20040012141 - 16/11/2004; France: FR2878097 (A1 and B1) - 19/05/2006 and 16/02/2007 (patent granted); International: WO2006053956 (A1) - 26/05/2006; Canada: CA2587503 (A1) - 26/05/2006; India: 2105/CHENP/2007 - 2007/09/07; USA: US2009319122 (A1) - 24/12/2009, then US8239075 (B2) - 07/08/2012 (patent granted)

[6] A. Youssef, Y. Crouzet, A. de Bonneval, J. Arlat, J-J. Aubert, P. Brot, "Communication Integrity in Networks for Critical Control Systems", 6th European Dependable Computing Conference (EDCC-6), Coimbra, Portugal, October 18-20, 2006, pp. 23-32 (IEEE CS Press)

[7] A. Zammali, A. de Bonneval, Y. Crouzet, "A Multi-function Error Detection Policy to Enhance Communication Integrity in Critical Embedded Systems", 8th International Conference on Software Security and Reliability (SERE 2014), June 30-July 2, 2014, San Francisco, California, USA, pp. 19-24, LAAS Report N° 14156

[8] A. Zammali, A. de Bonneval, Y. Crouzet, "Communication Integrity for Slow-Dynamic Critical Embedded Systems", International Conference on Computer Safety, Reliability and Security (SafeComp 2013), September 24-27, 2013, Toulouse, France, 2p., LAAS Report N° 13569
