Frequency Penalty
Frequency penalty reduces the chance that the model will repeat the same tokens multiple times. It helps discourage overused words or phrases in the generated output.
How does it work
As the model generates tokens, it tracks how many times each token has already appeared. The more frequently a token appears, the more its probability is penalized. This encourages the model to vary its wording rather than reuse the same terms repeatedly.
When to use frequency penalty
- When the model repeats words or phrases too often
- When you want cleaner, less redundant output
- When generating lists, paragraphs, or creative writing
- When improving the readability of longer completions
How to use frequency penalty
- Set a value between -2.0 and 2.0. Most common values range from 0.5 to 1.5
- Add frequency_penalty to your API call
- Test the output. Increase the penalty if repetition remains
- Adjust in combination with other parameters like temperature or presence_penalty
Tips
- Higher values reduce repetition but may impact coherence if set too high
- Presence penalty penalizes whether a token has appeared at all; frequency penalty is based on how many times it appears
- Use in chatbots, content generation, and summaries where repeated language is common
- Can be especially helpful when the model gets stuck repeating similar phrases