This blog post explores constrained beam search, a powerful extension of traditional beam search for text generation that allows developers to enforce specific words or phrases in the output. This technique is especially useful in tasks like machine translation, where you may want to control formality or include domain-specific terms.
Why Constrained Beam Search is Challenging
Standard beam search generates text token by token, predicting the next token based on the current sequence. The difficulty with constraints arises because the model needs to know when and where to insert required tokens. For example, if you want the word "Sie" to appear in a German translation, the model must decide at which step to emit it, and it must ensure the entire constraint is satisfied by the end of generation.
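The token-by-token process can be sketched with a toy next-token distribution. The vocabulary and probabilities below are invented purely for illustration; a real model would produce a distribution over its full vocabulary at each step:

```python
from math import log

# Toy next-token model: maps a context tuple to candidate tokens with
# probabilities. Invented for illustration; not a real language model.
TOY_MODEL = {
    (): {"Wie": 0.9, "Was": 0.1},
    ("Wie",): {"alt": 0.8, "geht": 0.2},
    ("Wie", "alt"): {"bist": 0.6, "sind": 0.4},
    ("Wie", "alt", "bist"): {"du": 0.95, "<eos>": 0.05},
    ("Wie", "alt", "sind"): {"Sie": 0.9, "<eos>": 0.1},
}

def beam_search(num_beams=2, max_len=5):
    # Each beam is a (tokens, log-probability) pair.
    beams = [((), 0.0)]
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            # Unknown contexts just emit <eos> in this toy setup.
            for tok, p in TOY_MODEL.get(tokens, {"<eos>": 1.0}).items():
                candidates.append((tokens + (tok,), score + log(p)))
        # Keep only the num_beams highest-scoring hypotheses.
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:num_beams]
    return beams[0][0]
```

Run unconstrained, this toy search settles on the informal "Wie alt bist du" because it is the more probable path, which is exactly why a constraint mechanism is needed to force "Sie" into the output.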
Moreover, handling multiple constraints—some mandatory, some optional—adds complexity. For instance, you might need to force two phrases in order, or allow the model to choose between alternative phrases.
Example 1: Forcing a Word
Consider translating "How old are you?" from English to German. The informal translation is "Wie alt bist du?", while the formal version is "Wie alt sind Sie?". With traditional beam search, the output is:
Output:
Wie alt bist du?
Using constrained beam search with force_words_ids set to the token IDs of "Sie", you can guide the generation to the formal version:
Output:
Wie alt sind Sie?
This is achieved by tokenizing the forced words into token IDs and passing them to the generate() function:
force_words_ids = tokenizer(["Sie"], add_special_tokens=False).input_ids
outputs = model.generate(input_ids, force_words_ids=force_words_ids, num_beams=5)
Example 2: Disjunctive Constraints
Constrained beam search also supports disjunctive constraints—allowing the model to choose between multiple options. For example, you might require the output to contain either "du" or "Sie". By specifying a list of lists of token IDs, the model will ensure that at least one of the sequences appears in the output.
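The nested-list shape, and what it means for a constraint to be satisfied, can be illustrated with a small sketch. The token IDs below are made up; in practice they come from the tokenizer, e.g. tokenizer(["du", "Sie"], add_special_tokens=False).input_ids:

```python
# Hypothetical token IDs for illustration only.
DU_IDS = [146]
SIE_IDS = [229]

# A disjunctive constraint: a list of alternative token-ID sequences,
# of which at least one must appear in the generated output.
disjunctive_constraint = [DU_IDS, SIE_IDS]

def contains(output_ids, phrase_ids):
    """True if phrase_ids occurs as a contiguous run inside output_ids."""
    n = len(phrase_ids)
    return any(output_ids[i:i + n] == phrase_ids
               for i in range(len(output_ids) - n + 1))

def satisfies_disjunction(output_ids, alternatives):
    # The constraint is met if any one alternative appears in full.
    return any(contains(output_ids, alt) for alt in alternatives)
```

An output containing token 229 ("Sie" in this made-up vocabulary) satisfies the disjunction even though 146 ("du") never appears.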
Under the Hood: How It Works
The implementation uses a bank system to track constraint progress. During beam search, each hypothesis maintains its own record of how far it has advanced through the constraints. Beams are grouped into banks by that progress, and the selection step draws from more-advanced banks with higher priority, balancing constraint fulfillment against model likelihood. This ensures that constraints are satisfied as early as possible without sacrificing output quality.
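A rough sketch of the bank idea follows. The hypotheses and scores are invented, and the real selection logic in Transformers is more involved; this only shows the grouping-and-round-robin shape of the selection step:

```python
# Each hypothesis carries its log-score and how many constraint tokens
# it has completed so far ("done"). Values invented for illustration.
beams = [
    {"tokens": ["Wie", "alt", "bist"],        "score": -0.8, "done": 0},
    {"tokens": ["Wie", "alt", "sind"],        "score": -1.2, "done": 0},
    {"tokens": ["Wie", "alt", "sind", "Sie"], "score": -1.5, "done": 1},
]

def select_beams(beams, num_beams):
    # Group hypotheses into banks keyed by constraint progress, each bank
    # sorted by score, then pick round-robin from the most-advanced bank
    # downward so that both high-progress and high-score beams survive.
    banks = {}
    for b in sorted(beams, key=lambda b: b["score"], reverse=True):
        banks.setdefault(b["done"], []).append(b)
    ordered_banks = [banks[k] for k in sorted(banks, reverse=True)]
    selected = []
    while len(selected) < num_beams and any(ordered_banks):
        for bank in ordered_banks:
            if bank and len(selected) < num_beams:
                selected.append(bank.pop(0))
    return selected
```

With num_beams=2, the lower-scoring hypothesis that already contains "Sie" survives alongside the best unconstrained one, instead of being pruned purely on score.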
Custom Constraints
Beyond simple word forcing, the Constraint class allows for custom logic. You can create constraints that must appear in order, or that must be fulfilled by a certain generation step.
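The general shape of such a constraint can be sketched as a small stateful class. This mirrors the idea of an object that beam search consults and updates at every decoding step, but it is a simplified stand-in, not the actual Transformers Constraint API:

```python
class PhrasalConstraintSketch:
    """Tracks progress toward generating a fixed token-ID sequence.

    A simplified stand-in for the kind of stateful object constrained
    beam search consults each step; not the library's real interface.
    """

    def __init__(self, token_ids):
        self.token_ids = token_ids
        self.fulfilled = 0  # how many tokens of the phrase are done

    def advance(self):
        """The token that would make progress, or None if finished."""
        if self.completed():
            return None
        return self.token_ids[self.fulfilled]

    def update(self, token_id):
        """Record a generated token; a mismatch resets progress."""
        if self.completed():
            return
        if token_id == self.token_ids[self.fulfilled]:
            self.fulfilled += 1
        else:
            self.fulfilled = 0

    def remaining(self):
        return len(self.token_ids) - self.fulfilled

    def completed(self):
        return self.fulfilled == len(self.token_ids)
```

Each beam hypothesis would hold its own copy of such objects, which is how partially satisfied constraints can be tracked per beam.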
Conclusion
Constrained beam search is a versatile tool for controlling text generation. It enables you to inject prior knowledge directly into the decoding process, eliminating the need for post-hoc filtering. Whether you need to force specific words, handle optional phrases, or implement complex logical conditions, this feature expands the capabilities of Hugging Face Transformers for practical applications.