Trustworthy AI
Machine learning models are increasingly used in different types of systems, e.g., to implement a perception function in an autonomous system or to provide personalized recommendations in a social network. These examples illustrate different notions of trust that are being explored within the team.
Which notion of trust is relevant depends on whether the machine learning models are deployed in critical autonomous systems or in social networks.
In the first case, trust rests on the satisfaction of traditional safety properties. Our work focuses on how to detect errors of the machine learning model (e.g., a pedestrian that is not detected) [1, 2], how to design protection mechanisms against these errors, and how to evaluate the resulting safety gain [3].
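To make the idea concrete, here is a minimal sketch of a confidence-threshold runtime monitor wrapped around a hypothetical detector. It only illustrates the general kind of safety monitor whose evaluation is discussed in [3]; it is not the SENA approach of [1], and the detector interface and threshold value are assumptions.

    import numpy as np

    # Minimal sketch of a confidence-threshold runtime monitor (illustrative only;
    # not the SENA approach of [1]). The detector interface and the threshold
    # value are assumptions.
    CONFIDENCE_THRESHOLD = 0.8  # assumed value, to be calibrated on held-out data

    def monitor_accepts(scores: np.ndarray) -> bool:
        """Accept the prediction only if the model is sufficiently confident."""
        return float(np.max(scores)) >= CONFIDENCE_THRESHOLD

    def safe_predict(detector, frame):
        """Wrap a detector with the monitor; return None to signal an alarm,
        e.g., to hand control over to a fallback such as emergency braking."""
        scores = detector(frame)          # hypothetical: per-class softmax scores
        if monitor_accepts(scores):
            return int(np.argmax(scores))
        return None

A real monitor would rely on richer signals than the softmax score, such as the internal neural activations used in [1] or the out-of-distribution indicators discussed in [2].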
In the second case, trust rests on broader societal expectations, such as transparent and non-discriminatory decision making [4, 5]. Establishing this trust is difficult, however, because the machine learning model is typically inaccessible and can only be observed as a black box. We develop auditing methods to infer properties of the nature and/or behavior of the model from partial observations of that behavior. Part of this auditing work is carried out in collaboration with legal experts [6] and with government services (PEReN: Pôle d'Expertise de la Régulation Numérique).
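As an illustration of what a query-based black-box audit can look like (a sketch only, not the methods of [4, 5]), the following code estimates a demographic parity gap by submitting inputs with randomly assigned protected-attribute values to a hypothetical model_api; the API, the profile sampler, and the query budget are all assumptions.

    import random

    # Minimal sketch of a query-based black-box audit (illustrative only; not the
    # methods of [4, 5]). `model_api` and `sample_profile` are hypothetical: the
    # auditor can only submit inputs and observe binary decisions.
    def audit_demographic_parity(model_api, sample_profile, n_queries=1000):
        """Estimate the gap in positive-decision rates between two groups."""
        decisions = {0: [], 1: []}
        for _ in range(n_queries):
            profile = sample_profile()            # draw a random test profile
            group = random.randint(0, 1)          # randomly set the protected attribute
            profile["group"] = group
            decisions[group].append(model_api(profile))  # black-box query: 0 or 1
        rate = lambda xs: sum(xs) / len(xs) if xs else 0.0
        return abs(rate(decisions[0]) - rate(decisions[1]))  # demographic parity gap

A real audit must also account for the fact that the audited platform controls the interface and may adapt its answers to the auditor, an issue studied in [4].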
References
[1] Raul Sena Ferreira, Joris Guérin, Jérémie Guiochet, Hélène Waeselynck, "SENA: Similarity-Based Error-Checking of Neural Activations", Proc. 26th European Conference on Artificial Intelligence (ECAI 2023), pp. 724-731, 2023.
[2] Joris Guérin, Kevin Delmas, Raul Sena Ferreira, Jérémie Guiochet, "Out-of-Distribution Detection Is Not All You Need", Proc. 37th AAAI Conference on Artificial Intelligence (AAAI 2023), pp. 14829-14837, 2023.
[3] Joris Guérin, Raul Sena Ferreira, Kevin Delmas, Jérémie Guiochet, "Unifying Evaluation of Machine Learning Safety Monitors", Proc. IEEE 33rd International Symposium on Software Reliability Engineering (ISSRE 2022), pp. 414-422, 2022.
[4] Erwan Le Merrer, Gilles Trédan, "Remote explainability faces the bouncer problem", Nature Machine Intelligence, vol. 2, no. 9, pp. 529-539, 2020.
[5] Erwan Le Merrer, Benoît Morgan, Gilles Trédan, "Setting the Record Straighter on Shadow Banning", Proc. 40th IEEE Conference on Computer Communications (INFOCOM 2021), 2021.
[6] Erwan Le Merrer, Ronan Pons, Gilles Trédan, "Algorithmic audits of algorithms, and the law", AI and Ethics, vol. 4, no. 4, pp. 1365-1375, 2024.