Researchers have introduced LangFlow, a framework showing that continuous diffusion can compete with traditional discrete models in language modeling. The method combines embedding-space diffusion with Flow Matching via a Bregman-divergence formulation, giving it a solid theoretical grounding. A key contribution is an ODE-based evaluation metric for estimating perplexity. To improve training efficiency, the team proposes an information-uniform noise schedule derived from a Gumbel distribution, concentrating training on the regions with the highest information gain. The work challenges the dominance of discrete approaches and opens new directions for generative language models.
LangFlow: Continuous Diffusion Matches Discrete Models in Language Modeling
AI
May 2, 2026 · 3:44 PM
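To make the core idea concrete, here is a minimal, generic sketch of flow matching in token-embedding space under a linear probability path. This is an illustration of the general technique, not LangFlow's actual method: the `VelocityField` network, its sizes, and the toy setup are all hypothetical, and the paper's Bregman-divergence formulation and Gumbel noise schedule are not reproduced here.

```python
import torch
import torch.nn as nn

# Sketch: conditional flow matching in embedding space (illustrative only).
# Tokens map to continuous embeddings; a velocity field v_theta is trained to
# transport Gaussian noise (t=0) toward the embeddings (t=1) along the
# straight-line path x_t = (1 - t) * x0 + t * x1.

class VelocityField(nn.Module):
    def __init__(self, dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Condition on time by concatenating t to the input features.
        return self.net(torch.cat([x_t, t], dim=-1))

def flow_matching_loss(model: nn.Module, x1: torch.Tensor) -> torch.Tensor:
    """Regress the constant target velocity x1 - x0 along the linear path."""
    x0 = torch.randn_like(x1)        # noise endpoint
    t = torch.rand(x1.shape[0], 1)   # uniform times in [0, 1]
    x_t = (1 - t) * x0 + t * x1      # point on the interpolation path
    target = x1 - x0                 # ODE velocity along that path
    return ((model(x_t, t) - target) ** 2).mean()

# Toy usage: "token embeddings" are just random learned vectors here.
torch.manual_seed(0)
embed = nn.Embedding(100, 16)
model = VelocityField(16)
tokens = torch.randint(0, 100, (32,))
loss = flow_matching_loss(model, embed(tokens).detach())
print(loss.item())
```

Sampling would then integrate the learned ODE from noise at t=0 to t=1 and round the result back to the nearest token embedding; the paper's ODE-based perplexity metric likewise builds on this deterministic flow.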