OpenAI has a goblin problem. Instructions designed to guide the behavior of the company’s latest model as it writes code have been revealed to include a peculiar directive: "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant."
The guideline was discovered in the system prompt for OpenAI's Codex, an AI coding assistant. The unusual list of forbidden topics suggests that the model previously tended to bring up fantastical creatures or animals when they were not pertinent to the task. This could lead to irrelevant or nonsensical code suggestions, undermining the tool's usefulness.
OpenAI has not officially commented on why it included such a specific prohibition, but it appears to be an effort to curb the model's creative digressions. The company has long worked to align its AI models to behave appropriately and avoid generating off-topic or harmful outputs.
The revelation has sparked amusement and curiosity among developers and AI enthusiasts, highlighting the ongoing challenges in fine-tuning large language models to stay focused and reliable in practical applications.