DailyGlimpse

Agentic AI: A Local LLM That Plays Video Games and Codes on Its Own

AI
May 3, 2026 · 2:39 AM

In a recent demonstration, a developer showcased an agentic use of local language models, enabling an LLM to interact with video games and autonomously write code. The video, titled "IA AGÉNTICA 🤖 Un LLM que JUEGA videojuegos y CODIFICA solo" ("Agentic AI: an LLM that plays video games and codes on its own"), explores how a model can analyze images, reason about its environment, and execute actions using tools.
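The observe-reason-act pattern described above can be sketched as a minimal agent loop. This is an illustrative skeleton, not the developer's actual harness; `query_model`, `observe`, and `execute` are hypothetical placeholders for the local LLM call, screen capture, and input injection.

```python
# Minimal agent-loop sketch. `query_model`, `observe`, and `execute` are
# hypothetical stand-ins for the real model call and game interface.
def run_agent(query_model, observe, execute, max_steps=5):
    """Observe the environment, ask the model for an action, execute it."""
    history = []
    for _ in range(max_steps):
        observation = observe()  # e.g. a description of the current screen
        prompt = f"Observation: {observation}\nPrevious actions: {history}\nNext action?"
        action = query_model(prompt)
        history.append(action)
        if action == "done":
            break
        execute(action)
    return history

# Usage with stub functions standing in for the real game interface:
actions = run_agent(
    query_model=lambda p: "press_w" if "street" in p else "done",
    observe=lambda: "character standing on a street",
    execute=lambda a: None,
)
```

The loop keeps a running action history in the prompt so the model can condition each decision on what it has already tried.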

Testing the Model in GTA

The test environment included Grand Theft Auto (GTA), where the LLM was given the task of navigating the game world. The model processed visual input from the game screen and issued keyboard commands to control the character. Early results showed the model could perform basic movements and interact with objects, though with occasional errors.

Code Harness and Context Importance

The setup involved a custom code harness that provided the model with contextual information about the game state and available actions. By maintaining a structured prompt with the current game status, the model could make more informed decisions. The developer emphasized that the quality of context greatly influenced the model's performance.
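A structured prompt of the kind described might look like the following sketch. The state fields and action names are invented for illustration; the point is that serializing the game state into a predictable layout gives the model reliable context to reason over.

```python
import json

def build_prompt(game_state: dict, available_actions: list[str]) -> str:
    """Assemble a structured prompt carrying the current game context."""
    return (
        "You control a character in a video game.\n"
        f"Current state:\n{json.dumps(game_state, indent=2)}\n"
        f"Available actions: {', '.join(available_actions)}\n"
        "Reply with exactly one action."
    )

prompt = build_prompt(
    {"position": "sidewalk", "health": 100, "objective": "reach the car"},
    ["move_forward", "turn_left", "turn_right", "interact"],
)
```

Constraining the reply to "exactly one action" also simplifies parsing on the harness side.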

Comparing Models and Improving Reasoning

Different LLMs were tested, including those optimized for reasoning. The developer noted that models with stronger reasoning capabilities performed better in complex scenarios, such as planning a route or avoiding obstacles. Techniques like chain-of-thought prompting further improved the model's ability to break down tasks.
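Chain-of-thought prompting can be as simple as a wrapper that asks the model to reason before acting. The exact wording below is an assumption; the delimiter convention (`<think>` tags) mirrors what reasoning-tuned models commonly emit.

```python
def with_chain_of_thought(task: str) -> str:
    """Wrap a task with a chain-of-thought instruction so the model
    reasons step by step before committing to a final action."""
    return (
        f"Task: {task}\n"
        "First think step by step inside <think>...</think> tags, "
        "then give the final action on its own line."
    )

cot_prompt = with_chain_of_thought("plan a route to the car while avoiding traffic")
```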

Intercepting the Chain of Thought

To better understand the model's decision-making, the developer modified the code to intercept the chain-of-thought output. This allowed real-time inspection of the model's reasoning process, revealing how it interpreted visual cues and selected actions.
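Interception of this kind typically means splitting the raw model output into its reasoning span and its final answer. A minimal sketch, assuming the model wraps its chain of thought in `<think>...</think>` tags (the convention used by DeepSeek-style models):

```python
import re

# Matches a <think>...</think> reasoning span, including newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(raw: str) -> tuple[str, str]:
    """Separate the chain of thought from the final answer in model output."""
    match = THINK_RE.search(raw)
    if not match:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    answer = raw[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>The car is ahead, so move forward.</think>\nmove_forward"
)
```

Logging the `reasoning` half while acting only on the `answer` half gives exactly the kind of real-time inspection described above.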

Setting Up Node.js and MCPs

Part of the demonstration involved installing Node.js and configuring Model Context Protocol (MCP) servers to let the LLM interact with external tools. This gave the model access to a web search API (Brave Search) for autonomous research, extending its capabilities beyond the game environment.
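An MCP server entry for Brave Search is typically declared in a JSON configuration consumed by the MCP client. The sketch below builds such an entry in Python; the package name follows the reference `@modelcontextprotocol/server-brave-search` server, but the exact layout depends on the client you use, so verify it against your client's documentation.

```python
import json

# Sketch of an MCP server entry for Brave Search, in the JSON layout used by
# common MCP clients. Run via Node.js's npx; the API key goes in the env block.
mcp_config = {
    "mcpServers": {
        "brave-search": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-brave-search"],
            "env": {"BRAVE_API_KEY": "<your-api-key>"},
        }
    }
}

config_json = json.dumps(mcp_config, indent=2)
```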

Integration of Brave Search

By integrating Brave Search, the LLM could look up information (e.g., game maps or coding syntax) independently. This showcased a shift from a static model to an agentic system that can gather and apply knowledge in real time.
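Outside of MCP, the same lookup can be done against the Brave Search web API directly. The sketch below only constructs the request (sending it would need network access and a real subscription token); the endpoint and `X-Subscription-Token` header follow Brave's API conventions.

```python
from urllib.parse import urlencode

BRAVE_ENDPOINT = "https://api.search.brave.com/res/v1/web/search"

def build_search_request(query: str, api_key: str) -> tuple[str, dict]:
    """Build the URL and headers for a Brave Search web query.
    (Request construction only; no network call is made here.)"""
    url = f"{BRAVE_ENDPOINT}?{urlencode({'q': query})}"
    headers = {
        "Accept": "application/json",
        "X-Subscription-Token": api_key,
    }
    return url, headers

url, headers = build_search_request("GTA map landmarks", "demo-key")
```

In an agentic loop, the model would emit a search query as a tool call, the harness would send this request, and the JSON results would be fed back into the model's context.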

Extracting Reasoning in llama.cpp with DeepSeek Templates

Finally, the developer demonstrated how to extract the model's reasoning using llama.cpp and DeepSeek-style templates. This technique allowed for cleaner separation of reasoning tokens and action outputs, making the system more modular and debuggable.
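When llama.cpp's server is run with a DeepSeek-style reasoning format, the chat completion separates the chain of thought into a `reasoning_content` field alongside the final `content`. The sketch below parses a response shaped that way; the mock payload is invented for illustration, and the field name should be checked against your llama.cpp version.

```python
def extract_reasoning(response: dict) -> tuple[str, str]:
    """Pull reasoning and final text from a llama.cpp chat completion.
    Assumes a DeepSeek-style reasoning format, which places the chain of
    thought in `reasoning_content` separate from the answer in `content`."""
    message = response["choices"][0]["message"]
    return message.get("reasoning_content", ""), message.get("content", "")

# Mock response shaped like the server's chat-completion output:
mock_response = {
    "choices": [{
        "message": {
            "reasoning_content": "The user asked for a sum; 2 + 2 = 4.",
            "content": "4",
        }
    }]
}
reasoning, answer = extract_reasoning(mock_response)
```

Keeping reasoning and action in separate fields is what makes the system "modular and debuggable": the harness can log one and act on the other without any string parsing.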

Overall, the video illustrates the growing potential of local LLMs as autonomous agents capable of performing complex, multi-step tasks—from playing video games to writing code—without constant human intervention.