State of the world, AI, and XAI after the DARPA program in 2021

There is currently no universal solution to XAI. As discussed earlier, different user types require different types of explanations. This is no different from what we face when interacting with other humans. Consider, for example, a doctor needing to explain a diagnosis to a fellow doctor, a patient, or a medical review board. Perhaps future XAI systems will be able to automatically calibrate and communicate explanations to a specific user across a large range of user types, but that capability remains well beyond the current state of the art.
One of the challenges in developing XAI is measuring the effectiveness of an explanation. DARPA’s XAI effort has helped develop foundational technology in this area, but much more needs to be done, including drawing more from the human factors and psychology communities. Measures of explanation effectiveness need to be well established, well understood, and easily implemented by the developer community in order for effective explanations to become a core capability of ML systems.
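Such measures will likely need to span both human-subject evaluations and automated proxies. As one illustration of the latter, the sketch below computes a deletion-style faithfulness score for a saliency explanation; this is a proxy metric drawn from the saliency literature rather than a measure developed under the program, and the model, images, and parameters in the example are hypothetical.

```python
# Minimal sketch (not part of the XAI program's evaluation protocol): a
# deletion-style faithfulness score for a saliency explanation. Pixels are
# occluded in order of decreasing saliency; a confidence curve that collapses
# quickly suggests the explanation highlights features the model actually uses.
import numpy as np


def deletion_score(predict, image, saliency, target_class, steps=20):
    """Return the average model confidence over the deletion curve.

    `predict` maps a 2-D image array to a vector of class scores; `saliency`
    has the same shape as `image`. Lower scores indicate a more faithful
    explanation (confidence drops quickly once salient pixels are removed).
    """
    order = np.argsort(saliency.ravel())[::-1]        # most salient pixels first
    occluded = image.copy()
    per_step = max(1, order.size // steps)
    curve = [predict(occluded)[target_class]]
    for i in range(steps):
        occluded.flat[order[i * per_step:(i + 1) * per_step]] = 0.0
        curve.append(predict(occluded)[target_class])
    return float(np.mean(curve))                      # crude area-under-curve proxy


if __name__ == "__main__":
    # Toy example: a made-up "model" whose confidence depends only on the
    # upper-left quadrant, plus one saliency map that marks that quadrant
    # (faithful) and one that is random noise (unfaithful).
    rng = np.random.default_rng(0)
    img = rng.random((32, 32))
    predict = lambda x: np.array([x[:16, :16].mean(), 1.0 - x[:16, :16].mean()])
    faithful = np.zeros((32, 32))
    faithful[:16, :16] = 1.0
    unfaithful = rng.random((32, 32))
    print(deletion_score(predict, img, faithful, 0))    # expected: lower score
    print(deletion_score(predict, img, unfaithful, 0))  # expected: higher score
```

Averaging the confidence curve serves here as a crude stand-in for the area under the deletion curve; in practice, such automated scores complement, rather than replace, the human-subject studies emphasized above.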
UC Berkeley's result (Kim et al. 2021), demonstrating that advisability, the ability of an AI system to take advice from a user, improves user trust beyond explanations alone, is intriguing. Users will likely prefer systems whose behavior they can quickly correct, in much the same way that humans provide feedback to each other. Such advisable AI systems, which can both produce and consume explanations, will be key to enabling closer collaboration between humans and AI systems.
Effectively developing XAI techniques requires close collaboration across multiple disciplines, including computer science, machine learning, artificial intelligence, human factors, and psychology. This can be particularly challenging, as researchers tend to focus on a single domain and often need to be pushed to work across domains. Perhaps in the future an XAI-specific research discipline will emerge at the intersection of these fields. Towards this end, we have worked to create the Explainable AI Toolkit (XAITK), which collects the various program artifacts (e.g., code, papers, and reports) and lessons learned from the four-year DARPA XAI program into a central, publicly accessible location (Hu et al. 2021). We believe the toolkit will be of broad interest to anyone who deploys AI capabilities in operational settings and needs to validate, characterize, and trust AI performance across a wide range of real-world conditions and application areas.
Today we have a more nuanced, less dramatic, and perhaps more accurate understanding of AI than we had in 2015. We certainly have a more accurate understanding of the possibilities and limitations of deep learning. The AI apocalypse has faded from an imminent danger to a distant curiosity. Similarly, the XAI program has produced a more nuanced, less dramatic, and perhaps more accurate understanding of XAI. The program certainly acted as a catalyst to stimulate XAI research, both inside and outside of the program. The results include a more nuanced understanding of XAI uses and users, the psychology of explanation, and the challenges of measuring explanation effectiveness, as well as a new portfolio of XAI ML and HCI techniques. There is certainly more work to be done, especially as new AI techniques are developed that will continue to need explanation. XAI will remain an active research area for some time. The authors believe that the XAI program has made a significant contribution by providing the foundation to launch that endeavor.