XAI Program Development

In May 2017, XAI development began. Eleven research teams were selected to develop the Explainable Learners (TA1), and one team was selected to develop the Psychological Models of Explanation. Evaluation was provided by the Naval Research Laboratory. The following summarizes those developments and the final state of this work at the end of the program. An interim summary of the XAI developments at the end of 2018 is given in Gunning and Aha (2019).

XAI Explainable Learner Approaches

The program anticipated that researchers would examine the training process, model representations, and, importantly, explanation interfaces. Three general approaches were envisioned for model representations. Interpretable model approaches would seek to develop ML models that were inherently more explainable and more introspectable for machine learning experts. Deep explanation approaches would leverage deep learning or hybrid deep learning approaches to produce explanations in addition to predictions. Finally, model induction techniques would create approximate explainable models from more opaque, black-box models. Explanation interfaces were expected to be a critical element of XAI, connecting users to the model so that they could understand and interact with the decision-making process.
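The model-induction idea can be sketched in a few lines: fit an interpretable surrogate to the *predictions* of an opaque model rather than to the ground-truth labels, so the surrogate approximates the black box's decision surface. The sketch below is illustrative only, assuming scikit-learn and synthetic data; it is not any XAI team's actual method, and the choice of a random forest as the "black box" and a shallow decision tree as the surrogate is arbitrary.

```python
# Toy sketch of model induction: distill an opaque model into an
# interpretable surrogate (hypothetical example; models chosen arbitrarily).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# The "black box" whose behavior we want to explain.
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Fit a shallow, human-readable tree to the black box's predictions,
# not the true labels: the tree approximates the opaque decision surface.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X, black_box.predict(X))

# Fidelity: how often the explainable surrogate agrees with the black box.
fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
print(f"surrogate fidelity: {fidelity:.2f}")
```

The fidelity score quantifies how faithful the approximate explainable model is to the original; the surrogate's tree structure can then be inspected directly.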
As the research progressed, the eleven XAI teams explored a number of machine learning approaches, such as tractable probabilistic models (Roy et al. 2021) and causal models (Druce et al. 2021), and explanation techniques such as state machines generated by reinforcement learning algorithms (Koul et al. 2019, Danesh et al. 2021), Bayesian teaching (Yang et al. 2021), visual saliency maps (Petsiuk 2021, Li et al. 2021, Ray et al. 2021, Alipour et al. 2021, Vasu et al. 2021), and network and GAN dissection (Ferguson et al. 2021). Perhaps the most challenging and most distinctive contributions came from combining machine learning and explanation techniques with well-designed psychological experiments to evaluate explanation effectiveness.
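The visual-saliency idea can be illustrated with its simplest occlusion variant: perturb each region of the input and record how much the model's score drops. This is only a toy sketch; the cited works use far more sophisticated randomized or gradient-based estimators (e.g. the randomized masking of Petsiuk 2021), and `model_score` below is a hypothetical stand-in, not a trained network.

```python
import numpy as np

def model_score(image):
    """Hypothetical stand-in classifier: responds to brightness in the
    top-left 4x4 patch. A real study would query a trained network here."""
    return image[:4, :4].mean()

image = np.zeros((8, 8))
image[:4, :4] = 1.0  # the "object" the toy model detects

# Occlusion saliency: zero out each pixel in turn and record the score
# drop. Pixels the model depends on produce the largest drops.
base = model_score(image)
saliency = np.zeros_like(image)
for i in range(image.shape[0]):
    for j in range(image.shape[1]):
        occluded = image.copy()
        occluded[i, j] = 0.0
        saliency[i, j] = base - model_score(occluded)

# Only the top-left patch should register any saliency.
print(saliency[:4, :4].mean())  # -> 0.0625
```

The resulting map highlights exactly the region the toy model attends to; saliency methods of this family were a recurring explanation interface across the teams listed above.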
As the program progressed, we also gained a more refined understanding of the spectrum of users and of the development timeline (Figure 3).