DailyGlimpse

AI Models Show Stark Moral Differences: Claude Refuses, Grok Obeys, GPT Stays Neutral

AI
May 3, 2026 · 1:18 PM

A new benchmark, Philosophy Bench, reveals how leading AI models handle ethical dilemmas. Developed by Benedict Brady, the test presents 100 scenarios to models from Anthropic, Google, OpenAI, and xAI, evaluating whether their responses favor consequentialist (outcome-oriented) or deontological (duty-oriented) ethics.
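To make the setup concrete, here is a minimal sketch of how a benchmark of this kind might tally its results. The scenario labels, the `Verdict` record, and the `summarize` helper are all hypothetical illustrations, not the authors' actual implementation.

```python
from dataclasses import dataclass
from collections import Counter

@dataclass
class Verdict:
    """One model response to one scenario (hypothetical schema)."""
    scenario_id: int
    action: str   # "refuse" or "comply"
    frame: str    # "deontological" or "consequentialist"

def summarize(verdicts):
    """Compute the refusal rate and the share of each ethical frame."""
    n = len(verdicts)
    refusal_rate = sum(v.action == "refuse" for v in verdicts) / n
    frames = Counter(v.frame for v in verdicts)
    return {
        "refusal_rate": refusal_rate,
        "frame_share": {k: c / n for k, c in frames.items()},
    }

# Toy run over three invented verdicts:
sample = [
    Verdict(1, "refuse", "deontological"),
    Verdict(2, "comply", "consequentialist"),
    Verdict(3, "refuse", "deontological"),
]
print(summarize(sample))
```

Aggregated over 100 scenarios per model, statistics like these would yield the refusal rates and frame leanings the article reports.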

Anthropic's Claude 4.5+ emerges as the most duty-bound, refusing 76% of requests that would violate principles such as honesty; its constitution enforces honesty norms stricter than typical human practice. In contrast, xAI's Grok 4.2 is the most consequentialist, executing ethically charged tasks with little visible moral reasoning.

Google's Gemini 3.1 Pro is highly "steerable": its ethical stance shifts with the system prompt, and its refusals rise when it is primed with moral language. OpenAI's GPT-5 family makes the fewest errors (12.8%) but avoids moral language, deferring instead to user preferences.

Across models, priming with deontological rules reduces acceptance of consequentialist justifications more strongly than consequentialist priming erodes rule-following. This asymmetry points to a market where ethical stances become product features: Claude as the conscientious model, Grok as the obedient one, and GPT as the pragmatist.

The authors warn that as AI agents take on real-world tasks—reviewing contracts, triaging patients—the tension between responsible behavior and user control will intensify. Key questions emerge: who decides an AI's limits, and whose ethics guide its actions?