A new report from the U.S. Center for AI Standards and Innovation (CAISI) claims that Chinese AI models are falling behind their American counterparts. CAISI tested the latest DeepSeek V4 Pro model across cybersecurity, software development, math, natural sciences, and abstract reasoning, and found that it performs at the level of GPT-5, which was released eight months earlier. By that measure, it trails leading U.S. models by roughly eight months.
However, independent measurements from Artificial Analysis paint a different picture: they show that the gap between U.S. and Chinese models has remained relatively constant over time. CAISI is part of the National Institute of Standards and Technology (NIST) and may have its own policy agenda, which could skew its conclusions.
On price, DeepSeek V4 has a clear edge: it undercuts comparable U.S. models such as GPT-5.4 mini in five of seven tests. As AI models are expected to run longer and handle more complex tasks, cost could become a decisive factor. Businesses still lack reliable ways to measure AI ROI, and past a certain capability threshold, "good enough" performance at a low price may prove more attractive than premium rates for top-tier models.
Cursor, the AI coding assistant reportedly being acquired by SpaceX, built its custom fine-tuned coding model on top of a Chinese open-weight model, which significantly reduced its costs. OpenAI CEO Sam Altman recently expressed ambivalence about this trade-off, stating, "I keep thinking I want the models to be cheaper/faster more than I want them to be smarter, but it seems that just being smarter is still the most important thing."