References

Akula, Arjun R., et al. “CX-ToM: Counterfactual Explanations with Theory-of-Mind for Enhancing Human Trust in Image Recognition Models.” arXiv preprint arXiv:2109.01401 (2021) (accepted to iScience 2021).
Akula, Arjun R., et al. “CoCoX: Generating Conceptual and Counterfactual Explanations via Fault-Lines.” AAAI, 2020.
Bau, David, et al. “GAN dissection: Visualizing and understanding generative adversarial networks.” arXiv preprint arXiv:1811.10597 (2018).
Cellan-Jones, R. (2014). Stephen Hawking warns artificial intelligence could end mankind. BBC News, 2 (10), 2014.
Danesh, Mohamad H., et al. “Re-understanding Finite-State Representations of Recurrent Policy Networks.” International Conference on Machine Learning. PMLR, 2021.
Dikkala, Rupika, et al. “Doing Remote Controlled Studies with Humans: Tales from the COVID Trenches.” ACM-IEEE CHASE. 2021.
Edmonds, Mark, et al. “A tale of two explanations: Enhancing human trust by explaining robot behavior.” Science Robotics 4.37 (2019).
Folke, Tomas, et al. “Explainable AI for medical imaging: explaining pneumothorax diagnoses with Bayesian teaching.” arXiv preprint arXiv:2106.04684 (2021).
Gibbs, S. Elon Musk leads 116 experts calling for outright ban of killer robots. The Guardian, 20, 2017.
Gunning, David, and David Aha. “DARPA’s explainable artificial intelligence (XAI) program.” AI Magazine 40.2 (2019): 44-58.
Johnson, W. Lewis. “Agents that Learn to Explain Themselves.” AAAI. 1994.
Jordan, Michael I., and Tom M. Mitchell. “Machine learning: Trends, perspectives, and prospects.” Science 349.6245 (2015): 255-260.
Khorram, Saeed, Tyler Lawson, and Li Fuxin. “iGOS++: integrated gradient optimized saliency by bilateral perturbations.” Proceedings of the Conference on Health, Inference, and Learning. 2021.
Koul, Anurag, Alan Fern, and Sam Greydanus. “Learning Finite State Representations of Recurrent Policy Networks.” International Conference on Learning Representations. 2019.
Kulesza, Todd, et al. “Principles of explanatory debugging to personalize interactive machine learning.” Proceedings of the 20th International Conference on Intelligent User Interfaces. 2015.
Lacave, Carmen, and Francisco J. Díez. “A review of explanation methods for Bayesian networks.” The Knowledge Engineering Review 17.2 (2002): 107-127.
Letham, Benjamin, et al. “Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model.” The Annals of Applied Statistics 9.3 (2015): 1350-1371.
Lin, Zhengxian, Kim-Ho Lam, and Alan Fern. “Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions.”