October 25, 2024 10:30 am - 12:00 pm ET
Seminar
Stockbridge 303

"The Explainable Edge for Subjective Evaluations? A Field Experiment on AI-Augmented Evaluation of Early-Stage Innovations"

The rise of generative AI has transformed creative problem-solving, necessitating a reevaluation of idea evaluation processes so that the resulting abundance of solutions can be assessed and selected from effectively. This study investigates how human-AI collaboration can enhance early-stage evaluations of innovative solutions, examining the interplay between objective criteria, which are quantifiable and measurable, and subjective criteria, which rest on personal opinion and intuition. Partnering with MIT Solve, a marketplace for social impact innovation, we conducted a field experiment in which 72 experts and 156 community screeners evaluated 48 real-world solutions for the 2024 Global Health Equity Challenge, yielding a total of 3,002 screener-solution pairs. We used GPT-4, a state-of-the-art large language model, to provide recommendations and explanations for screening decisions. Our experiment compared a human-only control condition against two AI-assisted treatments: a black-box AI providing recommendations without explanations, and an explainable AI offering both recommendations and rationales for its decisions. Our findings reveal that screeners use AI insights strategically, validating the AI's recommendations when they agree and scrutinizing them when they disagree. Screeners assisted by AI were roughly 9 percentage points more likely to fail a solution than the control group, driven primarily by the AI's more stringent failure recommendations. Notably, when the AI provided explanations for subjective-criteria failures, screeners were 12 percentage points more likely to adhere to its recommendations than in the black-box treatment. Data from interviews and mouse tracking suggest that AI explanations for subjective criteria led screeners to doubt their own judgment and possibly over-rely on the AI's explanations. This research suggests a possible framework for human-AI collaboration in evaluating early-stage innovations.
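To make the treatment design concrete, below is a minimal sketch of how the two AI-assisted conditions could be implemented, assuming the OpenAI Python client and the chat completions API. The function name, prompt wording, and criterion format are illustrative assumptions, not the study's actual materials.

```python
# Minimal sketch of the two AI-assisted treatment conditions, assuming the
# OpenAI chat completions API. Prompt wording and criterion format are
# hypothetical illustrations, not the study's actual screening materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def screen_solution(solution_text: str, criterion: str, explainable: bool) -> str:
    """Ask GPT-4 for a pass/fail recommendation on one screening criterion.

    explainable=False mirrors the black-box treatment (recommendation only);
    explainable=True mirrors the explainable treatment (recommendation plus
    a rationale for the decision).
    """
    instruction = (
        "You are screening early-stage social-impact solutions. "
        f"Decide whether the solution PASSES or FAILS this criterion: {criterion}. "
    )
    if explainable:
        instruction += "Answer PASS or FAIL, then give a brief rationale."
    else:
        instruction += "Answer with a single word: PASS or FAIL."

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": instruction},
            {"role": "user", "content": solution_text},
        ],
    )
    return response.choices[0].message.content


# Black-box condition: recommendation only.
# print(screen_solution(proposal, "alignment with the challenge goals", explainable=False))
# Explainable condition: recommendation plus rationale.
# print(screen_solution(proposal, "alignment with the challenge goals", explainable=True))
```

Keeping the two conditions identical except for the rationale request isolates the effect of explanations, mirroring the black-box versus explainable contrast at the heart of the experiment.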