XAI Program Development
In May 2017, XAI development began. Eleven research teams were selected
to develop the Explainable Learners (TA1), and one team was selected to
develop the Psychological Models of Explanation (TA2). Evaluation was
provided by the Naval Research Laboratory. The following summarizes
those developments and the final state of this work at the end of the
program. An interim summary of the XAI developments at the end of 2018
is given in Gunning and Aha (2019).
XAI Explainable Learner Approaches
The program anticipated that researchers would examine the training
process, model representations, and, importantly, explanation
interfaces. Three general approaches were envisioned for model
representations. Interpretable model approaches would seek to develop ML
models that were inherently more explainable and more introspectable for
machine learning experts. Deep explanation approaches would leverage
deep learning or hybrid deep learning approaches to produce explanations
in addition to predictions. Finally, model induction techniques would
create approximate explainable models from more opaque, black-box
models. Explanation interfaces were expected to be a critical element of
XAI, connecting users to the model and enabling them to understand and
interact with its decision-making process.
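
To make the model-induction idea concrete, the following is a minimal
sketch using scikit-learn: an opaque model is trained first, and an
interpretable model is then fitted to the opaque model's predictions so
that it approximates the black box's behavior. This distillation-style
decision tree is a generic illustration, not any XAI team's actual
method; the names black_box and surrogate are ours.

    # Minimal sketch of model induction: fit an interpretable surrogate
    # to a black-box model's predictions (a generic distillation-style
    # illustration, not any XAI team's actual method).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.tree import DecisionTreeClassifier, export_text

    X, y = make_classification(n_samples=2000, n_features=8, random_state=0)

    # Opaque "black-box" model trained on the raw labels.
    black_box = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

    # The surrogate is trained on the black box's predictions, so it
    # approximates the black box rather than the data itself.
    surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
    surrogate.fit(X, black_box.predict(X))

    # Fidelity: how often the surrogate agrees with the black box.
    fidelity = (surrogate.predict(X) == black_box.predict(X)).mean()
    print(f"surrogate fidelity: {fidelity:.2%}")
    print(export_text(surrogate))  # human-readable decision rules

A faithful, shallow surrogate of this kind trades some accuracy for a
rule set a person can read, which is the core compromise model-induction
approaches explore.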
As the research progressed, the eleven XAI teams explored a number of
machine learning approaches, such as tractable probabilistic models (Roy
et al. 2021) and causal models (Druce et al. 2021), and explanation
techniques such as state machines generated by reinforcement learning
algorithms (Koul et al. 2019, Danesh et al. 2021), Bayesian teaching
(Yang et al. 2021), visual saliency maps (Petsiuk 2021, Li et al. 2021,
Ray et al. 2021, Alipour et al. 2021, Vasu et al. 2021), and network and
GAN dissection (Ferguson et al. 2021). Perhaps the most challenging and
most distinctive contributions came from combining machine learning and
explanation techniques with well-designed psychological experiments to
evaluate explanation effectiveness.
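
As one concrete example of the saliency-map family, the sketch below
computes a generic occlusion-based saliency map. It is a simplified
illustration of the idea, not the specific algorithms cited above; the
callable model_prob (mapping an image to a class probability) and the
patch and stride parameters are assumptions made for this example.

    import numpy as np

    def occlusion_saliency(model_prob, image, patch=8, stride=8):
        # Slide a neutral patch over the image and record how much the
        # model's class probability drops. A simplified sketch of the
        # saliency-map idea, not the cited methods themselves.
        # `model_prob` (assumed) maps an H x W x C array to a scalar.
        h, w = image.shape[:2]
        base = model_prob(image)
        saliency = np.zeros((h, w))
        for top in range(0, h - patch + 1, stride):
            for left in range(0, w - patch + 1, stride):
                occluded = image.copy()
                occluded[top:top + patch, left:left + patch] = image.mean()
                # Large probability drops mark regions the prediction
                # depends on.
                saliency[top:top + patch, left:left + patch] = base - model_prob(occluded)
        return saliency

The resulting map highlights the image regions whose occlusion most
degrades the prediction, which is the intuition shared by the more
sophisticated saliency techniques the teams developed.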
As the program progressed, we also gained a more refined understanding
of the spectrum of users and the development timeline (Figure 3).