Novel AI agent can rationalize its actions

An AI agent developed at the Georgia Institute of Technology automatically generates natural language explanations of the motivations behind its actions in real time, ideally allowing those who aren’t experts in the field to interact with AI tools more confidently.

The project was spearheaded by Upol Ehsan, a PhD candidate in the School of Interactive Computing at Georgia Tech.

“There is almost nothing artificial about artificial intelligence,” Ehsan wrote in a blog post dedicated to the new tech. “It is designed by humans for humans, from training to testing to usage. Therefore, it’s essential that AI systems are human-understandable. Sadly, as AI-powered systems get more complex, their decision-making tends to be more ‘black-boxed’ to the end-user, which inhibits trust.”

Ehsan and his colleagues, alongside researchers from Cornell University and the University of Kentucky, designed an AI agent that could play the classic arcade game Frogger and generate on-screen explanations to justify its actions in the game. The goal of Frogger is to get a cartoon frog home safely without being hit by vehicles or drowned in a river.

A group of study participants watched the AI play the game and were asked to rate and rank three on-screen rationales for each of the AI’s moves. One explanation was written by a human, one was AI-generated and one was generated randomly; all were judged on confidence, human-likeness, adequate justification for the action and understandability.

Ehsan and his team reported that while human-generated responses were still the most preferred by participants, AI-generated explanations were a close second. AI-generated rationales were ranked higher by participants when they demonstrated recognition of environmental conditions and adaptability and when they communicated awareness of upcoming dangers and planned ahead for them. Responses that were redundant or stated the obvious were ranked lowest.

A follow-up study dropped the human-written explanations, asking participants to rank a set of AI-generated responses by preference in scenarios where the AI made a mistake or behaved unexpectedly. Rationales were either concise and targeted or holistic and focused more on the context of the game.

Participants favored answers that were holistic by a 3-to-1 margin, suggesting they appreciated the AI thinking about future steps rather than making decisions in the moment.

“This project provided a foundational understanding of AI agents that can mimic thinking out loud,” Ehsan wrote. “Possible future directions include understanding what happens when humans can contest an AI-generated explanation. Researchers will also look at how agents might respond in different scenarios, such as during an emergency response or when aiding teachers in the classroom.”

Ehsan et al.’s work was presented at the Association for Computing Machinery’s Intelligent User Interfaces 2019 Conference.

After graduating from Indiana University-Bloomington with a bachelor’s in journalism, Anicka joined TriMed’s Chicago team in 2017 covering cardiology. Close to her heart is long-form journalism, Pilot G-2 pens, dark chocolate and her dog Harper Lee.
