State of the world, AI, and XAI after the DARPA program in 2021
There is currently no universal solution to XAI. As discussed earlier,
different user types require different types of explanations. This is no
different from what we face interacting with other humans. Consider, for
example, a doctor needing to explain a diagnosis to a fellow doctor, a
patient, or a medical review board. Perhaps future XAI systems will be
able to automatically calibrate and communicate explanations to a
specific user within a large range of user types, but that is still
significantly beyond the current state of the art.
One of the challenges in developing XAI is measuring the effectiveness
of an explanation. DARPA’s XAI effort has helped develop foundational
technology in this area, but much more needs to be done, including
drawing more from the human factors and psychology communities. Measures
of explanation effectiveness need to be well established, well
understood, and easily implemented by the developer community in order
for effective explanations to become a core capability of ML systems.
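To make the measurement challenge concrete, the sketch below shows one simple, automatable proxy for explanation quality: a deletion-style faithfulness score for saliency-type explanations. It is only an illustration under stated assumptions (the `model` callable, the single-channel input, and the array shapes are hypothetical, not artifacts of the program), and it deliberately omits the human-subject measures, such as user task performance and calibrated trust, that the human factors and psychology communities would emphasize.

```python
import numpy as np

def deletion_score(model, image, saliency, num_steps=20, baseline=0.0):
    """Deletion-style faithfulness sketch: blank out pixels in order of
    decreasing saliency and track how quickly the model's confidence in
    its original prediction falls.  `model` is a hypothetical callable
    mapping a single-channel H x W image to a 1-D array of class
    probabilities; `saliency` has the same shape as `image`."""
    probs = model(image)
    target = int(np.argmax(probs))                   # class whose score we track
    order = np.argsort(saliency, axis=None)[::-1]    # most salient pixels first
    per_step = max(1, order.size // num_steps)

    perturbed = image.astype(float)                  # work on a copy
    confidences = [float(probs[target])]
    for step in range(num_steps):
        idx = order[step * per_step:(step + 1) * per_step]
        perturbed.flat[idx] = baseline               # "delete" the next chunk
        confidences.append(float(model(perturbed)[target]))

    # Area under the confidence-versus-fraction-deleted curve:
    # lower values suggest the explanation highlighted pixels the model used.
    return float(np.trapz(confidences, dx=1.0 / num_steps))
```

A lower score means that removing the pixels the explanation ranked highest quickly destroys the model's confidence, i.e., the explanation was faithful to what the model relied on; it says nothing about whether the explanation actually helped a human user.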
UC Berkeley’s result (Kim et al. 2021), demonstrating that advisability,
the ability of an AI system to take advice from a user, improves user
trust beyond explanations alone, is intriguing. Users will likely prefer
systems whose behavior they can quickly correct, much as humans provide
feedback to each other. Such advisable AI systems, able to both produce
and consume explanations, will be key to enabling closer collaboration
between humans and AI systems.
Close collaboration is required across multiple disciplines including
computer science, machine learning, artificial intelligence, human
factors, and psychology, among others, in order to effectively develop
XAI techniques. This can be particularly challenging, as researchers
tend to focus on a single domain and often need to be pushed to work
across domains. Perhaps in the future a XAI-specific research discipline
will be created at the intersection of multiple current disciplines.
Towards this end, we have worked to create an Explainable AI Toolkit
(XAITK), which collects the various program artifacts (e.g., code,
papers, and reports) and lessons learned from the four-year DARPA XAI
program into a central, publicly accessible location (Hu et al. 2021).
We believe the toolkit will be of broad interest to anyone who deploys
AI capabilities in operational settings and needs to validate,
characterize, and trust AI performance across a wide range of real-world
conditions and application areas.
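As a concrete illustration of the kind of post-hoc technique such a toolkit gathers, the sketch below implements occlusion-based saliency for a black-box image classifier. It is a minimal sketch, not the XAITK API: the `model` callable and the single-channel input are assumptions made for the example.

```python
import numpy as np

def occlusion_saliency(model, image, target_class, patch=8, baseline=0.0):
    """Occlusion-based saliency sketch: slide a patch over the image,
    blank it out, and record how much the target class score drops.
    `model` is a hypothetical callable mapping a single-channel H x W
    image to a 1-D array of class probabilities."""
    h, w = image.shape
    base_score = model(image)[target_class]
    saliency = np.zeros((h, w), dtype=float)

    for top in range(0, h, patch):
        for left in range(0, w, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = baseline
            drop = base_score - model(occluded)[target_class]
            saliency[top:top + patch, left:left + patch] = drop

    return saliency  # larger values mark regions the prediction depends on
```

The design is deliberately model-agnostic: the explanation is computed purely from input perturbations and output scores, which is what makes techniques of this kind straightforward to apply across the range of deployed systems described above.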
Today we have a more nuanced, less dramatic, and, perhaps, more accurate
understanding of AI than we had in 2015. We certainly have a more
accurate understanding of the possibilities and the limitations of deep
learning. The AI apocalypse has faded from an imminent danger to a
distant curiosity. Similarly, the XAI program has produced a more
nuanced, less dramatic, and, perhaps, more accurate understanding of
XAI. The program certainly acted as a catalyst to stimulate XAI research
(both inside and outside of the program). The results have produced a
more nuanced understanding of XAI uses and users, the psychology of XAI,
the challenges of measuring explanation effectiveness, as well as a new
portfolio of XAI ML and HCI techniques. There is
certainly more work to be done, especially as new AI techniques are
developed that will continue to need explanation. XAI will continue as
an active research area for some time. The authors believe that the XAI
program has made a significant contribution by providing the foundation
to launch that endeavor.