When it comes to finding ways to leverage artificial intelligence and machine learning (AI/ML), analysts are frequently overwhelmed by choice. Often, putting AI/ML to good use requires in-depth technical knowledge that is hard to come by.
A significant challenge facing the community and AI/ML practitioners is how to move these tools from development into the real world. The biggest hang-up is that AI/ML models are inherently a “black box.”
The black box makes it hard for users and practitioners alike to explain what happens between input and output. We understand that AI/ML models are capable of superhuman computing, yet we are unable to clearly understand the logic behind an AI agent’s decision making. A team of researchers at Assured Information Security (AIS) is beginning to chip away at this problem by leading cutting-edge research in explainable AI.
Demystifying the black box
AI and AI-driven agents are inherently unexplainable: they lack the ability to provide context for, or explain, their decision-making processes. This lack of transparency introduces uncertainty and forces potential adopters to question trustworthiness. For example, if a model or algorithm arrives at the correct answer 90 percent of the time, we must be able to understand why it fails the remaining 10 percent of the time in order to address its limitations. In the absence of this understanding, a model’s failures can seem random, which is unacceptable, especially if the goal is to apply AI/ML to high-risk or high-stakes applications. As a result, leaders are lukewarm not only about developing AI, but, more importantly, about deploying it.
Because AI lacks the ability to answer the important questions (what, when, where, how, and why), we lack the ability to realize the entirety of its benefits. These answers are hidden within the difficult-to-open black box.
A team of researchers from AIS, the Georgia Tech Research Institute (GTRI), and the Georgia Institute of Technology (GT) is taking a “psychology for AI” approach to tackle this problem. First, it is important to make a clear distinction between interpretable and explainable AI. Interpretability concerns the relationship between model inputs and model outputs, or one’s ability to predict outputs. Explainable AI (XAI) concerns one’s ability to understand the what, when, where, how, and why of an agent’s decision-making or computing process. The goal of XAI is to eliminate the black box between inputs and outputs, resulting in a transparent sequence from the data going into the model to the results coming out.
Progression of AI-enabled technologies
In recent years, XAI has gained significant momentum and enthusiasm from the Department of Defense (DoD) contractor community to progress the deployment of AI-enabled technologies. Our goal is not only to better bridge the gap (interpret), but also to enable exploration of the bridge between inputs and outputs (explain). We hypothesize that there will be a direct correlation between exploration and trust. As explainability matures, trust will continue to grow between the operational community and these AI/ML-enabled technologies. Though we may be far from the adoption of fully autonomous systems, significant benefits can be realized with AI-enabled, human-in-the-loop technologies.
XAI happenings at AIS
Team AIS is currently pushing explainability further by leveraging a combination of reinforcement learning (RL), world-models, and counterfactuals (what-ifs). Reinforcement learning looks at how agents should take actions to maximize reward. In other words, the more accurate the agent’s action choices, the higher the reward it receives. World-models are abstract representations of the AI agent’s environment. Counterfactuals, or “what-ifs,” are simply alternative choices.
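To make the reinforcement-learning idea concrete, here is a minimal sketch of tabular Q-learning on a toy four-state corridor. The environment, reward values, and hyperparameters are all illustrative assumptions, not AIS's actual setup; the point is simply that actions leading to higher reward are reinforced until the agent's policy favors them.

```python
import random

# Hypothetical toy environment: states 0..3 in a line; reaching state 3 (the
# goal) yields reward. All values here are assumed for illustration.
N_STATES, GOAL = 4, 3
ACTIONS = [-1, +1]  # move left or right

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == GOAL else 0.0
    return nxt, reward, nxt == GOAL

# Tabular Q-learning: action-value estimates are refined so that
# higher-reward action choices are reinforced over time.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

random.seed(0)
for _ in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < epsilon \
            else max(ACTIONS, key=lambda a: Q[(s, a)])
        nxt, r, done = step(s, a)
        best_next = max(Q[(nxt, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = nxt

# After training, the greedy policy at every non-goal state moves right (+1),
# toward the reward.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)}
print(policy)
```

Note that the learned Q-table is exactly the kind of artifact the black-box problem describes: it tells us *which* action the agent prefers, but not *why*, which is where the counterfactual analysis below comes in.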
This research seeks to develop an explainability method that builds user trust in the policy decision process. To do this, we leverage forward projection, meaning the RL agent can explore varying paths without having to take them. Combining forward projection with counterfactual analysis allows us to visualize the on-policy action, or the factual, alongside a series of off-policy actions, or counterfactuals. These counterfactuals show the alternative actions an agent could have taken, as compared to the action the agent chose to take. This visualization allows us to analyze and better understand some of the logic behind the agent’s decision-making process.
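The forward-projection step can be sketched as follows. This is a minimal illustration, assuming a deterministic world model; the transition table, state numbers, and reward values are invented for the example, and the function names (`world_model`, `rollout`) are not AIS's actual API. The key property is that candidate action sequences, factual and counterfactual alike, are rolled through the model without ever being executed in the real environment.

```python
def world_model(state, action):
    """Stand-in for a learned world model: predicts (next_state, reward)
    for a state-action pair. Values here are assumed for illustration."""
    transitions = {
        (0, "a"): (1, 0.0), (0, "b"): (2, 0.0),
        (1, "a"): (3, 1.0), (1, "b"): (0, 0.0),
        (2, "a"): (3, 0.2), (2, "b"): (0, 0.0),
        (3, "a"): (3, 0.0), (3, "b"): (3, 0.0),
    }
    return transitions[(state, action)]

def rollout(state, actions):
    """Forward projection: simulate a candidate action sequence through the
    world model without executing it, returning the projected path and the
    accumulated reward."""
    path, total = [state], 0.0
    for a in actions:
        state, r = world_model(state, a)
        path.append(state)
        total += r
    return path, total

# Compare the on-policy plan (the factual) against off-policy alternatives
# (the counterfactuals), side by side.
factual = ["a", "a"]
counterfactuals = [["b", "a"], ["a", "b"]]
for plan in [factual] + counterfactuals:
    path, ret = rollout(0, plan)
    print(plan, "->", path, "projected return:", ret)
```

Laying the factual rollout next to its counterfactuals in this way is what makes the comparison explainable: a user can see not just the chosen path but the projected consequences of the paths not taken.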
Analyzing the counterfactuals: The robot example
Take, for example, a simple robot using AI to move between floors of a building. Say you have an AI-enabled robot whose job is to navigate from the second floor of a building to the first. The robot can take either the stairs or the elevator. The stairs are the quickest route to the exit; the elevator is some distance away. Suppose the robot chooses the elevator over the stairs, even though we would have expected it to choose the stairs for their proximity and speed. Through our analysis of the counterfactuals, we can learn why the robot chose the elevator: its counterfactuals show that had it taken the stairs, it would have fallen. Despite the stairs being the faster path, we as users can now see that the agent favors the route that allows it to successfully complete the task.
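The robot example reduces to a small counterfactual comparison. In this sketch, the projected outcomes and reward values are invented purely for illustration; what matters is that placing the factual action next to its counterfactual makes the agent's otherwise puzzling choice legible.

```python
# Projected outcomes for each available action, as a forward-projecting agent
# might estimate them. Outcome labels and rewards are assumed for illustration.
projected = {
    "take_elevator": {"outcome": "reaches first floor", "reward": 5.0},   # factual
    "take_stairs":   {"outcome": "falls on the stairs", "reward": -10.0}, # counterfactual
}

# The agent picks the action with the highest projected reward; the
# counterfactual row explains why the "slower" elevator wins.
chosen = max(projected, key=lambda a: projected[a]["reward"])
print(chosen)
for action, info in projected.items():
    print(f"  {action}: {info['outcome']} (projected reward {info['reward']})")
```

Shown this table, a user no longer sees the elevator choice as a random failure to take the fastest route, but as a preference for completing the task successfully.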
This research continues to show success and viability toward becoming a reliable method for explainable AI.
Jeff Durst is a senior research scientist at Assured Information Security (AIS). Andres Colon, research scientist III at AIS, also contributed to this article. Founded in 2001 and headquartered in Rome, AIS is a cyber and information-security company that provides government and commercial customers with capabilities and services such as research, development, consulting, testing, forensics, remediation, and training.