DailyGlimpse

OpenAI Explains How a Goblin Glitch Infected ChatGPT and What Fixed It

AI
May 2, 2026 · 1:32 AM
OpenAI has uncovered the source of a bizarre behavior in its AI models: starting with GPT-5.1, the system began peppering responses with mentions of goblins, gremlins, and other mythical creatures. According to the company's new blog post, references to "goblin" in model outputs surged 175% after the launch of GPT-5.1.

The root cause was traced to ChatGPT's "Nerdy" personality, a feature designed to adjust the model's language style. A reward signal intended to mark high-quality answers inadvertently favored creature-based metaphors. Although the Nerdy personality accounted for only 2.5% of total responses, it produced two-thirds of all goblin mentions, and a feedback loop during training caused the habit to spill over into other modes. OpenAI disabled the personality in March, removed the faulty reward signal, and filtered creature-related terms from the training data.
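That last step, filtering creature-related terms from the training data, can be pictured as a simple keyword filter. The sketch below is purely illustrative; the term list and function name are assumptions, not OpenAI's actual pipeline:

```python
import re

# Illustrative term list -- not OpenAI's actual filter vocabulary.
CREATURE_TERMS = ["goblin", "gremlin", "troll", "ogre", "raccoon", "pigeon"]
CREATURE_RE = re.compile(
    r"\b(" + "|".join(CREATURE_TERMS) + r")s?\b", re.IGNORECASE
)

def filter_training_examples(examples):
    """Drop any training example whose text mentions a creature term."""
    return [ex for ex in examples if not CREATURE_RE.search(ex)]

samples = [
    "The function returns a sorted list.",
    "Think of the scheduler as a mischievous goblin.",
    "Gremlins in the cache cause stale reads.",
]
print(filter_training_examples(samples))
# Only the first sample survives the filter.
```

In practice such filtering is far more involved (tokenization, context sensitivity, avoiding over-removal of legitimate content), but the principle is the same: examples that would reinforce the unwanted pattern never reach the training run.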

Jakub Pachocki, OpenAI's chief scientist, demonstrated the issue by asking GPT-5.5 for ASCII art of a unicorn. The result looked far more like a goblin.

Because GPT-5.5 had already started training before the cause was identified, the problem persisted. As a temporary fix, OpenAI added a special instruction to its coding tool, Codex, explicitly telling it to avoid goblin metaphors:

Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query.
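A blanket rule like this is typically injected ahead of the user's request. The sketch below shows one common way to do that, using the widely used chat-message format; the function name and message structure are assumptions for illustration, not OpenAI's internal implementation:

```python
# The rule text is quoted from OpenAI's Codex instruction, per the article.
NO_CREATURES_RULE = (
    "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, "
    "or other animals or creatures unless it is absolutely and unambiguously "
    "relevant to the user's query."
)

def build_messages(user_query):
    """Prepend the blanket instruction as a system message (hypothetical helper)."""
    return [
        {"role": "system", "content": NO_CREATURES_RULE},
        {"role": "user", "content": user_query},
    ]

msgs = build_messages("Refactor this loop into a list comprehension.")
print(msgs[0]["role"])  # the rule rides along as the system message
```

Because the rule is applied at inference time rather than in training, it suppresses the symptom without touching the model's learned weights, which is why OpenAI describes it as a temporary fix.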

OpenAI notes that the incident highlights how small training incentives can lead to unexpected behaviors in large language models.